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Preface 


This  study  had  a  major  goal  of  extending  Markov  simulation  methods, 
especially  as  applied  to  weather  data,  to  two  or  more  dimensions.  These 
methods  should  be  computationally  simple,  since  they  would  be  designed 
to  produce  a  weather  data  field  as  only  one  of  many  possible  inputs  to  a 
more  complex  simulation  model  used  for  another  purpose,  such  as  weapon 
system  effectiveness.  Examples  performed  using  the  models  described 
in  this  paper  show  that  most  of  these  methods  are  realistic  and  all  of 
them  are  quite  easy  to  apply. 
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Department  for  suggesting  this  topic  to  me  and  for  his  constant  guidance 
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study  as  much  as  possible.  Also,  I  would  like  to  thank  Capt  Garry 
Jackson  of  AFFDL  for  his  ideas,  which  led  to  the  formulation  of  this 
problem,  and  for  his  constant  interest  in  the  outcome  of  the  research, 
and  I  would  like  to  thank  Lt  Col  Jon  Hobbs  from  the  AFIT  Systems 
Management  Department  for  his  helpful  suggestions  during  the  writing  of 
this  thesis.  Finally,  I  would  like  to  thank  God,  who  constantly  sus¬ 
tained  me  throughout  this  period. 
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Abstract 

In  many  cases,  weather  is  only  one  of  many  inputs  to  a  simulation 
model,  so  a  realistic  but  simple  weather  simulation  method  should  be 
included  in  the  model.  This  paper  has  three  major  areas  of  concern: 

( 1 )  A  fairly  extensive  review  of  applications  of  one-dimensional  Markov 
and  semi -Markov  chains  to  weather  data  simulations,  (2)  A  consideration 
of  factors  involved  in  and  methods  that  are  appropriate  for  extending 
Markov  concepts  to  simulations  of  gridded  data  in  two  or  more  dimen¬ 
sions,  and  (3)  Evaluation  of  the  proposed  methods  in  terms  of  realism 
and  simplicity  of  application.  A  discussion  of  the  general  character¬ 
istics  of  real  weather  variables  and  observed  weather  data  in  the  con¬ 
text  of  simulating  weather  as  a  stochastic  process  is  also  given. 

The  data  base  used  for  the  example  consisted  of  gridded  weekly 
maps  of  temperature  departures  from  normal  in  an  area  of  the  United 
States.  For  most  analyses,  the  data  was  converted  to  five  states, 
frcm  state  1  (coldest)  to  state  5  (wannest).  In  the  real  data,  it  was 
rare  to  have  an  occurrence  of  unequal  or  nonconsecutive  states  in 
adjacent  grid  points.  Such  occurrences  were  called  "unusual  transitions,” 
and  one  criterion  for  evaluating  the  realism  of  a  weather  simulation 
scheme  was  the  frequency  of  generating  these  transitions. 

Most  of  the  proposed  two-dimensional  simulation  methods  produced 
data  fields  that  were  basically  quite  realistic,  but  they  also  pro¬ 
duced  "unusual  transitions"  more  frequently  than  in  the  original  data. 

A  further  suggested  method,  no  more  complex  than  the  other  methods,  was 
found  to  greatly  reduce  the  frequency  of  these  "unusual  transitions" 
in  the  simulated  data  fields.  In  this  method,  the  probability  distri¬ 
bution  at  any  interior  point  of  the  grid  depends  on  the  states  observed 


vii 


at  the  grid  points  imnediately  west,  north,  and  northeast 
point,  and  the  interior  of  the  grid  is  simulated  by  rows, 
proposed  simulation  methods  are  very  easy  to  perform  on  a 


of  each 
All  of  the 
computer. 
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MARKOV  SIMULATIONS  OF  ONE-  ANT  TWO-DL'ENSIONAL 
WEATHER  DATA  BASES 

I.  Introduction 

Overview  of  Weather  Modeling 

Weather  is  still  an  important  factor  in  the  effectiveness  of  al¬ 
most  all  weapon  systems.  Fighter  aircraft,  heat-seeking  missiles, 
helicopters,  and  cruise  missiles  are  only  a  few  of  the  systems  whose 
performance  can  be  significantly  degraded  by  certain  adverse  weather 
conditions  such  as  icing,  low  ceilings,  or  low  visibility.  Since 
these  conditions  occur  very  frequently  in  many  areas  of  military  inter¬ 
est,  especially  Europe,  weather  is  a  major  concern  of  combat  planners 
and  needs  to  be  modeled  so  it  can  be  incorporated  into  planning  pro¬ 
cedures  . 

Categories  of  weather  models.  Weather  models  can  be  divided  inco 
three  basic  categories:  deterministic  models,  stochastic  models,  and 
models  that  are  a  combination  of  both. 

(1)  Deterministic  weather  models  use  initial  input  conditions 
abased  on  weather  observations!  and  physical  laws  of  the  atmosphere  to 
produce  explicit  forecasts  of  conditions  at  a  later  tine.  The  scales 
of  motion  and  interactions  involved  require  modeling  at  least  the 
Northern  or  Southern  Hemisphere  for  realistic  results,  if  forecasts 
are  made  for  a  period  of  more  than  a  few  hours.  Small-scale  or  tempor¬ 
ary  phenomena  such  as  thunderstorms  or  fog  cannot  be  adequately  modeled, 
and  accumulated  errors  cause  accuracy  to  decrease  substantially  after 
about  24  hours.  Because  the  physical  laws  are  universal,  the  models 
may  be  applied  almost  anywhere  (even  on  other  planets  with  an  atmosphere). 


1 


i2)  Stochastic  models  are  based  on  climatology,  or  a  knowledge 
of  past  weather  conditions.  Climatology  is  usually  a  statistical 
description  of  the  weather  and  is  often  expressed  in  the  fora  of 
probabilities  of  occurrences  of  given  weather  conditions.  A  forecast 
made  by  a  stochastic  model  will  consist  of  the  probabilities  of  various 
outcomes,  based  on  current  and  recent  weather  conditions.  Because  this 
is  3  statistical  rather  than  a  meteorological  approach,  probabilities 
must  be  generated  for  each  separate  location.  The  data  may  be  further 
stratified  by  season  of  year  and  time  of  day. 

(3)  Many  operational  computer-based  models  are  combinations  of 
the  deterministic  and  stochastic  types.  An  example  is  a  "probability 
of  precipitation"  forecast.  The  expected  conditions  at  a  given  point 
over  a  specified  time  interval  are  forecast  deterministically.  The 
uncertainty  due  to  inaccuracy  of  the  forecast  is  expressed  in  proba¬ 
bilistic  fora,  with  the  probabilities  determined  from  an  empirical 
model  based  on  the  forecast  model  outputs. 

Basic  reasons  for  weather  modeling.  The  major  reasons  for  de¬ 
siring  to  model  the  weather  can  be  divided  into  two  categories:  short- 
range  planning  and  long-range  planning.  Short-range  planning  as  used 
here  refers  to  the  time  period  for  which  specific  forecasts  are  availa¬ 
ble,  generally  no  more  than  a  few  days  into  the  future.  Long-range 
planning  is  considered  to  concern  time  periods  for  which  no  forecast  is 
available.  Long-range  planning  can  also  include  attempts  to  understand 
present  and  past  climates  (including  the  ice  ages),  since  any  past 
climate  could  possibly  recur  in  the  future. 

For  short-range  forecasting,  deterministic  models  are  quite 
effective  in  most  cases  and  are  routinely  produced  at  major  forecasting 


centers.  Some  small-scale  phenomena  that  cannot  be  adequately  modeled 
can  be  forecast  stochastically  using  conditional  probabilities  (such 
as  the  probability  of  fog  tomorrow  morning,  given  the  weather  condi¬ 
tions  observed  today)  obtained  from  a  "conditional  climatology." 

Long-range  weather  planning  is  most  effectively  accomplished 
using  a  stochastic  approach,  since  no  exact  knowledge  of  weather  con¬ 
ditions  can  be  obtained  beyond  a  certain  distance  into  the  future. 
Simulation  is  a  tool  frequently  used  in  long-range  planning.  Deter¬ 
ministic  models  have  had  some  use  in  investigating  past  climates  (such 
as  in  simulation  of  the  last  ice  age),  but  the  purpose  has  only  been  to 
obtain  estimates  of  averages.  In  general,  deterministic  models  are  too 
complex  and  use  too  much  computer  time  for  use  in  most  simulations  of 
Air  Force  interest,  while  stochastic  models  (such  as  Markov  chains)  can 
be  computationally  simple  and  still  provide  sufficient  realism. 

Simulation  and  Weather  Modeling 

Much  work  has  been  done  to  simulate  the  performance  of  weapon 
systems  under  varying  combat  situations.  Weather  is  often  included  as 
a  factor  in  the  simulation  models,  with  varying  degrees  of  realism. 

In  general,  the  weather  module  of  a  simulation  model  should  be  as 
computationally  simple  as  possible  and  should  not  introduce  any  more 
complexity  than  necessary.  However,  it  should  include  enough  of  the 
characteristics  of  the  actual  weather  (such  as  frequencies  of  occur¬ 
rence,  persistence,  and  variability,  in  space  as  well  as  time  if  needed) 
to  be  realistic  in  terms  of  the  weapon  systems  and  scenarios  consid¬ 
ered.  Since  a  simulation  is  generally  assumed  to  take  place  at  some 
unspecified  future  time,  for  which  the  only  thing  known  about  the 
weather  conditions  is  that  they  will  be  different  from  present  and  past 


weather,  an  appropriate  approach  is  to  describe  the  available  weather 
data  base  statistically  in  terns  of  probability  distributions.  Then, 
randan  procedures  would  be  used  to  generate  a  statistically  reasonable 
weather  sequence,  which  would  probably  not  be  the  same  as  any  actually 
observed  weather  sequence.  Only  the  parameters  th3t  generate  the 
sequence  would  need  to  be  stored  In  the  computer,  rather  than  the  en¬ 
tire  series  of  available  data. 

One  problem  in  any  simulation  is  to  decide  what  weather  data  (if 
any)  i3  needed.  This  will  not  be  a  major  concern  of  this  thesis,  since 
meteorological  reasoning  and  experimental  data  can  be  used  on  a  case- 
by-case  basis  to  determine  what  data  needs  to  be  included  in  each  simu¬ 
lation.  It  i3  assumed  here  that  data  is  already  available  or  can  be 
obtained,  since  for  mo3t  land  areas  of  the  world,  the  weather  has  been 
continuously  observed  at  specific  locations  for  20  to  100  years  or  more. 
Most  of  tliis  data  is  archived  in  computer-readable  format  at  the  Air’ 
Force  Environmental  Technical  Applications  Center  (ETAC),  Scott  Air 
Force  Base,  Illinois,  and  the  National  Climatic  Center  (NCC),  Asheville, 
North  Carolina.  Most  detailed  data  bases  covering  areas  of  the  earth's 
surface  or  volumes  of  the  earth's  atmosphere  on  a  small  scale  have  been 
available  only  since  the  early  1970's.  The  major  Air  Force  data  base 
of  this  type  i3  the  three-dimensional  nephanalysis  data  base  (3D  NEFH), 
archived  at  ETAC,  which  incorporates  weather  satellite  observations  and 
surface  and  upper-air  weather  data  to  provide  a  worldwide  15-layer 
analysis  of  atmospheric  moisture  and  cloud  cover  (nephanalysis) ,  with  a 
resolution  of  approximately  25  nautical  miles.  On  a  largo  scale,  dally 
Northern  Hemisphere  surface  weather  imps  have  been  published  back  to 
1399,  and  upper-air  weather  maps  are  available  back  to  1940  in  the  Hally 
Cories ,  Synopt lc  Weather  Maps  (D.S.  Department  of  Commerce,  W40  to 


Present)  and  Historical  Weather  Maps  !U.S.  Department  of  Commerce,  1899 
to  1939)  series.  This  paragraph,  of  course,  is  not  an  exhaustive  list 
of  available  data,  but  illustrates  the  fact  that  weather  data  is  already 
available  for  almost  any  desired  purpose. 

The  next  problem  in  incorporating  weather  into  a  simulation  model 
is  to  appropriately  describe  the  data  so  that  the  data  characteristics 
that  are  important  for  the  simulation  are  included  in  the  model.  For 
example,  to  simulate  a  long,  large-scale  military  exercise,  or  a  con¬ 
ventional  war  lasting  a  year  or  more,  it  may  be  appropriate  to  assume 
average  weather  conditions  or  conditions  that  match  the  normal  proba¬ 
bility  distributions  of  the  weather  phenomena.  To  simulate  a  shorter- 
duration  event,  such  as  a  Warsaw  Pact  attack  on  western  Europe,  both 
changes  and  persistence  from  day  to  day  should  be  included  in  the  model. 
Different  prevailing  weather  conditions  during  the  period  of  this  event 
(such  as  a  severe  winter  cold  spell,  a  mild  and  rainy  winter  period,  a 
typical  showery  period  in  the  sunmer,  or  a  summer  drought  and  heat  wave) 
would  lead  to  considerable  differences  in  the  effectiveness  of  weapon 
systems  against  various  types  of  targets.  In  predicting  the  fuel  costs 
for  a  building  (or  a  base)  during  a  heating  season,  the  overall  severity 
of  the  winter  would  be  the  important  factor,  rather  than  individual 
cold  waves.  However,  to  predict  the  performance  of  a  solar  heating 
system  (possibly  installed  in  the  same  building),  day-to-day  factors, 
such  as  cold  or  cloudy  periods,  are  the  most  important  factors. 

Many  procedures  have  been  extensively  developed  and  theoretically 
Justified  for  providing  a  statistical  description  of  the  climate  at  a 
given  location.  A  summary  of  the  weather  data  that  is  recorded  at  a 
single  station  over  a  period  of  time  can  be  called  a  "point  climatolo¬ 
gy."  For  most  stations  of  military  interest,  point  climatologies  are 


already  available  from  ETAC  or  NCC.  However,  published  statistical 
summaries  usually  do  not  include  data  about  persistence,  or  positive 
correlation  between  observations,  which  must  be  considered  in  most 
situations  that  extend  over  a  fairly  short  period  of  time.  When  there 
is  no  persistence,  a  simulated  data  sequence  can  be  generated  by 
Bernoulli  trials,  or  repeated  randan  sampling  from  the  unconditional 
probability  distribution  of  weather  states.  If  there  is  persistence, 
data  sequences  can  be  generated  using  Markov  or  semi -Markov  processes. 
Efforts  in  this  area  are  described  more  fully  in  the  literature  survey 
section. 

The  procedures  for  developing  an  "area  climatology or  a  statisti¬ 
cal  description  of  the  characteristics  of  the  weather  conditions  sim¬ 
ultaneously  occurring  over  an  area  or  in  a  given  volume  of  atmosphere, 
are  not  very  well  explored,  especially  for  weather  occurring  on  a  small 
scale.  The  data  that  is  available  in  map  or  gridded  data  form  is  con¬ 
cerned  almost  exclusively  with  averages  or  extremes  over  large  areas, 
such  as  in  the  cl imat ic  Atlas  of  the  United  States  iU.S.  Department  of 
Comnerce,  W6S).  In  a  simulation  concerned  with  a  small  ai'ea  over  a 
short  period  of  time,  such  a?  aerial  combat,  the  significant  meteoro¬ 
logical  factors  are  often  not  well  defined,  or  ai'e  not  based  on 
routinely  measured  weather  data.  (For  example,  horizontal  and  vertical 
visibilities  in  the  atmosphere  are  not  always  the  same.  Visibility 
observations  from  an  airplane  are  usually  approximate  and  are  infre¬ 
quently  reported).  Generally,  the  appropriate  statistical  distribu¬ 
tions  must  be  obtained  on  a  case-by-case  basis  fivm  the  raw  data  that 


is  used. 


Motivation  for  This  Project 

It  has  been  found  that  the  cruise  missile  is  quite  vulnerable  to 
ice  accumulation,  since  it  is  designed  to  fly  low,  with  little  reserve 
power  or  fuel.  In  winter  in  eastern  Europe  and  the  USSR,  icing  condi¬ 
tions  are  frequent  and  may  cover  a  large  area  for  as  long  as  several 
days. 

Once  this  problem  was  identified,  it  was  desired  to  find  out  the 
expected  proportion  of  cruise  missiles  that  could  be  lost  under  various 
realistic  operating  conditions.  If  the  number  lost  could  be  large, 
icing  protection  may  need  to  be  built  into  the  cruise  missile  even 
though  this  could  reduce  its  payload  and  range. 

ETAC  was  given  responsibility  to  investigate  this  problem.  Their 
approach  was  to  select  a  given  area  of  3D  NEPH  data  and  count  the  num¬ 
ber  and  duration  of  icing  encounters  (occurrences  of  specified  combina¬ 
tions  of  temperature  and  liquid  water  content)  along  a  simulated  path 
as  in  figure  1  for  each  data  time  available.  The  probabilities  of 
encounter  of  these  icing  situations  could  be  used  to  estimate  the 
probability  of  losing  a  cruise  missile  due  to  icing. 

If  further  simulations  are  desired  after  this  data  is  analyzed, 
there  are  many  possible  approaches  to  developing  a  simulation  model, 
two  of  which  are  of  interest  here.  In  both  cases,  mutually  exclusive 
weather  conditions  (or  states)  that  correspond  to  different  intensi¬ 
ties  of  cruise  missile  icing  could  be  defined. 

The  first  approach  is  most  suitable  if  the  probability  of  loss 
of  a  single  cruise  missile  due  to  icing  is  to  be  investigated.  In 
this  case,  a  one-dimensional  approach  that  would  simulate  a  sequence  of 
weather  conditions  along  a  path  could  be  used.  This  could  be  either 
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Figure  1.  Simulated  Path  Used  by  ETAC 
to  Analyze  Icing  Duration 

a  Markov  or  a  semi-Markov  model,  and  chapter  IV  describes  some  examples 
of  applications  of  both  types  of  models. 

The  second  method  is  more  appropriate  if  the  paths  of  a  large 
number  of  cruise  missiles  launched  during  a  short  time  interval  are  to 
be  simulated.  In  this  case,  a  two-dimensional  approach  should  be  used, 
assuming  that  the  cruise  missiles  are  to  fly  at  nearly  a  constant  level 
above  the  ground.  A  two-dimensional  data  field  would  be  generated  and 
a  large  number  of  cruise  missile  paths  through  this  data  field  would  be 
simulated.  The  proportion  of  cruise  missiles  lost  due  to  icing  would  be 
recorded,  and  this  procedure  would  be  repeated  with  different  generated 
data  bases  until  the  desired  statistics  have  been  found. 

Goals  of  This  Thesis 

In  view  of  the  preceding  areas  of  concern,  the  first  goal  of  this 
thesis  is  to  summarize  the  existing  literature  in  the  area  of  simulation 
using  one-dimensional  Markov  and  semi-Markov  processes,  especially  as 
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applied  to  weather  data,  for  the  benefit  of  users.  The  second  goal  is 
to  investigate  the  possibility  of  extending  Markov  concepts  to  two 
dimensions,  and  to  report  on  work  that  has  already  been  done  in  this 
area.  The  third  goal  is  to  perform  examples  of  the  suggested  simulation 
procedures  to  show  the  feasibility  of  the  methods  and  to  reveal  any 
operational  problems.  Basically,  the  two-dimensional  modeling  proce¬ 
dures  are  concerned  with  simulating  events  occurring  over  an  area  at  an 
instant  of  time.  These  concepts  may  be  extended  to  n  dimensions  (in¬ 
cluding  time  as  one  of  the  possible  dimensions),  although  complexity  of 
the  model  would  increase  greatly. 

Because  the  focus  of  this  investigation  is  statistical  rather  than 
meteorological,  it  will  not  be  primarily  concerned  with  meteorological 
processes.  Markov  chains  provide  a  statistical  description  of,  rather 
than  a  physical  explanation  for,  weather  phenomena.  Of  course,  when  a 
large  data  base  is  available,  the  statistical  description  should  agree 
with  physical  explanations. 

The  purpose  of  this  study  will  be  to  develop  procedures  suitable 
for  simulation  rather  than  forecasting,  although  the  procedures  could 
be  applied  to  a  "conditional  climatology"  approach  to  short-range  fore¬ 
casting. 

Overview 

Chapter  II  describes  some  of  the  characteristics  of  actual  weather 
variables  and  observed  weather  data. 

Chapter  III  briefly  summarizes  some  of  the  theory  of  appropriate 
Markov  and  semi-Markov  processes  and  defines  terms  and  notation  to  be 
used  in  this  report. 

Chapter  IV  describes  some  of  the  applications  of  Markov  processes 
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to  weather  data  series,  including  first-order  and  higher-order  Markov 
chains,  semi -Markov  processes,  and  statistical  tests  of  Markov  order. 

Qiapter  V  discusses  several  additional  topics  that  become  especially 
important  when  attempting  to  model  a  two-dimensional  data  field  by  a 
Markov  process,  including  reversibility  and  inverse  probabilities,  con¬ 
ditional  probabilities,  and  correlation. 

Chapter  VI  describes  a  regress ion- type  scheme  for  simulating  a 
two-dimensional  data  base  as  a  preliminary  to  extending  Markov  concepts 
to  two  or  more  dimensions.  The  methods  Galbraith  and  Walley  (1976)  and 
Pickard  (1977)  used  to  simulate  a  two-dimensional  data  field  by  Markov 
procedures  are  described,  and  an  additional  suggested  procedure  is  shown. 

Qiapter  VII  gives  examples  of  simulations  produced  by  the  Markov 
models  described  in  chapter  VI.  These  examples  show  that  all  of  the 
suggested  methods  can  generate  combinations  of  weather  states  that  are 
unusual  in  the  real  data.  A  suggested  refinement  of  Pickard's  method 
reduces  the  frequency  of  these  unusual  combinations.  Three  of  the  four 
suggested  methods  do  produce  data  fields  that  are  reasonably  realistic, 
and  all  of  the  simulation  methods  are  easy  to  perform  on  a  computer. 

Qiapter  VIII  summarizes  the  results  and  identifies  some  areas  of 


possible  further  research. 


II.  Weather  Data  Characteristics  and  Analysis 

Introduction 

Weather  is  naturally  occurring  and  is  difficult  or  impossible  to 
reproduce  under  laboratory  conditions,  so  it  must  be  observed  to  be 
described,  explained,  and  understood.  Weather  observations  seek  to 
describe  various  conditions  that  can  be  extracted  from  the  environment, 
are  measurable,  and  appear  significant.  It  is  important  to  know  what 
the  major  characteristics  of  the  underlying  real  variables  are,  and  how 
the  observed  data  differs  from  the  real  variables.  The  extensive 
general  discussion  of  weather  variables  in  this  chapter  is  prepared  for 
the  benefit  of  users  who  may  not  have  extensive  experience  in  working 
with  weather  data.  Also,  the  discussion  of  data  analysis  is  given  for 
those  who  may  not  be  familiar  with  some  of  the  techniques  used. 

As  described  before,  the  basic  purposes  of  a  weather  model  are  to 
aid  in  short-range  and  long-range  p lam i tig.  Current  weather  observa¬ 
tions  are  a  key  to  forecasting  the  future  wea ther,  and  a  long  series  of 
observations  is  needed  to  provide  an  adequate  climatology  in  the  form 
of  a  statistical  description  of  the  climate  of  a  location.  Climatolo¬ 
gical  data  is  used  for  various  planning  studies  throughout  the  Air 
Force,  including  simulation  models. 

This  chapter  will  show  that  available  weather  data  is  in,  or  can  be 
transferred  into,  a  form  suitable  for  modeling  as  a  stochastic  process. 
However,  the  meteorological  processes  involved  and  the  limitations  on 
the  data  should  be  considered  to  avoid  obtaining  useless  results. 

It  is  possible  to  construct  weather  variables  that  will  not  haw 
all  of  the  properties  named  in  this  chapter,  but  these  variables  can  be 
converted  into  variables  that  are  suitable  for  stochastic  modeling. 
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For  example,  the  cumulative  precipitation  measured  at  a  station  since 
its  establishment  could  be  divided  into  two  states:  (1)  under  1000 
inches,  and  (2)  1000  inches  or  over.  Once  1000  inches  has  been  observed, 
state  2  is  entered  and  is  never  left,  so  state  2  is  an  "absorbing  state" 
and  state  1  is  a  "transient  state,"  a  state  that  is  never  reentered  once 
left.  Weather  data  modeled  as  in  this  thesis  cannot  have  absorbing  or 
transient  states.  If  annual  (or  monthly  or  daily,  etc.)  precipitation 
is  recorded  instead,  this  problem  is  eliminated  and  the  data  has  the 
desired  properties.  In  this  thesis,  it  will  be  assumed  that  all  varia¬ 
bles  will  either  be  a  measurement  at  a  specific  time  or  a  cumulation 
over  a  specified  length  of  time,  and  such  data  can  be  suitable  for 
stochastic  modeling. 

Characteristics  of  Real  Weather  Variables 

Continuous .  Since  weather  occurs  continuously  in  time  and  space, 
at  least  above  the  molecular  level,  real  weather  data  is  fundamentally 
continuous.  Therefore,  most  real  weather  variables  are  either  contin¬ 
uous  (such  as  temperature)  or  are  integrals  of  continuous  variables  over 
discrete  time  intervals  (such  as  daily  precipitation). 

Deterministic.  Since  weather  is  governed  by  physical  laws,  it  is 
deterministic,  meaning  that  sane  conditions  will  lead  to  the  same  re¬ 
sult.  The  fundamental  laws  include  three  equations  of  motion  (one  in 
each  space  dimension)  and  equations  for  the  conservation  of  mass  and 
energy.  Sometimes  an  equation  for  the  conservation  of  moisture  is 
added.  These  are  nonlinear  partial  differential  equations  and  are  true 
at  any  point  in  the  atmosphere  at  any  given  time.  In  principle,  if  all 
of  the  conditions  at  a  given  time  are  known,  the  weather  conditions  at 
any  following  time  can  be  perfectly  forecasted.  Actually,  these 
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equations,  as  operationally  used,  are  incomplete  since  sane  factors  are 
left  out. 

Dependent .  Weather  data  is  never  truly  independent,  regardless  of 
the  separation  of  time  and  distance.  A  small  change  at  one  location 
will  have  an  eventual  effect  (although  it  may  be  slight)  on  the  weather 
at  every  other  location. 

Multivariate.  The  large  number  of  factors  and  interactions  that 
produce  weather  mean  that  weather  is  a  very  complex  process.  It  is 
multivariate,  with  vector  and  tensor  (or  matrix)  notation  required  to 
describe  the  weather  at  a  point  in  space  and  time. 

Nonlinear.  Weather  is  nonlinear,  as  shown  by  the  form  of  the 
governing  physical  equations,  which  cannot  be  analytically  solved  be¬ 
cause  of  their  nonlinear  form.  Also,  effects  may  not  appear  to  be 
proportional  to  the  cause,  and  interactions  between  weather  elements 
can  amplify  or  damp  out  the  changes  that  would  occur  from  each  weather 
element  separately. 

Scales  of  motion.  Weather  occurs  on  many  scales  of  motion  simul¬ 
taneously.  These  are  generally  divided  into  three  categories  by 
meteorologists:  macroscale  (horizontal  extent  approximately  1000 
kilometers  to  planetary  scale),  mesoscale  (about  10  to  1000  kilometers), 
and  microscale  (under  10  kilometers).  These  are  not  precise  divisions 
between  categories.  Examples  of  generally  microscale  phenomena  are 
turbulence,  tornados,  and  thunderstorms.  Squall  lines,  lake  effect 
snowstorms,  urban  pollution,  and  tropical  storms  are  generally  consid¬ 
ered  mesoscale  phenomena.  Macroscale  phenomena  include  air  masses, 
frontal  systems,  and  planetary  waves  (jet  streams).  Natural  or  nan-made 
geographic  features  such  as  cities,  rivers,  lakes,  mountain  ranges, 
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and  oceans  have  effects  on  the  weather  on  generally  the  same  scale  as 
the  size  of  the  geographic  feature. 

Scales  of  velocity.  The  physical  laws  of  the  atmosphere  contain 
solutions  for  all  three  types  of  air  motions  that  occur  in  the  atmos¬ 
phere.  Only  the  synoptic  (or  meteorological)  motions,  which  describe 
the  movement  of  air  masses,  are  significant  in  the  weather.  Meteoro¬ 
logical  motions  include  velocities  from  zero,  in  calm  air,  to  those 
observed  in  tornados  and  jet  streams.  Gravity  waves,  the  second  type 
of  motions,  arise  on  discontinuities  between  air  masses  and  are  some¬ 
times  visible  in  the  atmosphere  as  a  set  of  parallel  rows  of  clouds. 
Sound  waves  are  the  third  type  of  motions  and  are  also  obtained  as 
solutions  of  the  basic  equations.  Neither  gravity  nor  sound  waves  are 
of  meteorological  significance  although  they  do  occur  in  the  atmosphere. 
They  can  cause  entirely  spurious  results  in  numerical  weather  forecasts, 
so  they  are  removed  from  the  set  of  permissible  solutions  to  the 
governing  physical  equations  by  various  mathematical  filtering  methods. 

Scales  of  time.  For  a  given  phenomenon,  the  time  scale  may  vary 
independently  of  the  scale  of  distance  or  velocity.  For  example,  a  city 
will  have  some  effect  on  the  weather  as  it  develops  over  decades  or 
centuries,  although  its  horizontal  extent  is  only  a  few  miles,  while  a 
low  pressure  area  will  last  only  a  few  days  or  weeks  and  may  cover 
thousands  of  miles. 

In  descending  order,  some  of  these  time  scales  are  as  follows. 

The  causes  given  are  quite  speculative  for  scales  longer  than  a  year. 

Thousands  of  years  or  longer:  ice  ages  and  interglacial  periods. 
These  may  be  caused  by  regular  changes  in  the  earth's  orbit,  such  as 
the  tilt  of  the  earth's  axis,  or  by  changes  in  volcanism. 
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Centuries:  long  climatic  cycles  such  as  the  "Little  Ice  Age"  from 
about  1600  to  1S50.  Causes  could  include  the  longer  cycles  of  heat 
storage  in  the  deep  ocean  layers. 

Years  to  decades:  the  quasi-biennial  oscillation  van  approximately 
26-month  cycle  noticed  mainly  in  the  tropical  stratosphere) ,  drought  and 
wet  cycles,  or  warm  and  cool  periods.  A  cycle  averaging  approximately 
22  years  in  length  is  easily  visible  in  duo  temperature  data  and  in  the 
data  from  many  other  locations.  These  nay  be  caused  by  stable  or  nearly 
stable  oscillations  in  the  earth-atmosphere-ocean  system. 

Annual  or  seasonal:  the  annual  temperature  cycle  and  seasonal 
monsoons.  Abnormal  weather  may  also  occur  on  a  seasonal  scale.  Ex¬ 
amples  are  the  recent  extremely  cold  winters  in  the  central  and  eastern 
United  States,  and  the  hot  summers  of  1975  and  1976  in  western  Europe. 
These  were  caused  by  persistent  upper  air  patterns  on  the  scale  of  a 
few  months. 

Days  to  weeks:  air  mass  movements  and  tropical  and  extra tropical 
storm  systems.  These  phenomena  are  of  major  coixreni  to  forecasters  and 
can  be  forecasted  fairly  well  using  the  physical  equations  described 
previously. 

Hours  to  one  day:  daily  solar  heating  cycles,  thunderstorms, 
squall  lines,  urban  air  pollution  dispersion,  and  icing  occurrences. 

Seconds  to  minutes:  Gusts,  turbulence ,  and  diffusion  of  pollution 
from  a  single  source.  These  are  caused  or  affected  by  such  factors  as 
the  viscosity  of  the  atmosphere  and  the  characteristics  of  flow  over 
rough  surfaces. 

Periodic  components.  Man>  of  the  motions  occurring  on  various 
time  scales  are  periodic.  All  of  these  periodicities  are  irregular  in 
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some  way  because  of  the  number  of  interactions  that  occur  simultane¬ 
ously. 

Inhomogeneous .  A  result  of  the  periodicities  and  differing  scales 
of  motion  is  that  the  probabilities  of  occurrence  of  given  weather 
phenomena  are  not  constant  with  time  or  from  one  location  to  another. 

Not  in  Steady  state.  Weather  has  been  occurring  under  approxi¬ 
mately  the  same  conditions  (such  as  topography  and  ocean  levels) 
throughout  recent  human  history,  but  the  presence  of  long  climatic 
cycles  means  that  probability  distributions  for  weather  events  do  not 
stay  constant  even  when  averaged  over  long  periods.  It  appears  from 
geological  investigations  that  there  is  no  such  thing  as  a  "normal" 
climate  for  the  earth. 

Statistically  distributed.  Every  weather  variable  has  a  probabil¬ 
ity  distribution  or  density  function  of  seme  form,  generally  not  exact¬ 
ly  matching  any  theoretical  distribution  function.  The  meteorological 
theory  is  not  well  enough  developed  to  predict  the  exact  statistical 
form  of  the  distribution.  One  major  task  in  almost  any  weather  investi¬ 
gation  is  to  find  an  appropriate  distribution  function  for  a  given  set 
of  weather  data. 

Characteristics  of  Observed  Weather  Data 

Transformation  of  real  variables.  Ir.  reducing  actual  weather  to 
observed  data,  the  real  weather  variables  ar'e  transformed  in  many  ways, 
both  intentionally  and  unintentionally.  This  section  describes  some 
characteristics  that  are  significant  in  considering  measured  weather 
data.  All  of  the  properties  of  real  weather  variables  still  hold  for 
observed  weather  data,  except  for  the  continuous  and  deterministic 
properties,  which  are  modified. 
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Differing  methods  of  observation.  Some  variables,  such  as  temper¬ 
ature,  wind  speed  and  direction,  or  cloud  cover,  are  observed  directly 
by  an  instrument  or  observer. 

Other  variables  must  be  calculated  because  they  are  not  routinely 
measured,  cannot  be  directly  measured,  or  reflect  applications  of 
measured  variables  to  special  purposes.  Several  examples  can  be  given: 

(a)  Relative  humidity  is  calculated  from  the  air  temperature  and 
wet  bulb  or  dew  point  temperature  observations. 

(b)  Vertical  air  motion  cannot  be  satisfactorily  measured  using 
existing  instruments,  so  equations  have  been  developed  to  calculate 
vertical  velocities  from  other  observed  variables. 

(c)  The  geostrophic  wind  does  not  really  exist  but  is  useful  for 
certain  forecasting  purposes.  It  describes  the  wind  that  would  occur 
in  the  absence  of  friction  as  the  Coriolis  force  acts  on  air  masses  of 
differing  densities. 

(d)  The  number  of  heating  or  cooling  degree  days  has  been  found 
to  be  quite  accurately  proportional  to  the  amount  of  heating  or  cooling 
required,  when  comparing  one  time  period  or  location  with  another.  If 
a  day  averages  65  degrees  Fahrenheit  or  colder,  the  number  of  degrees 
by  which  the  daily  average  temperature  is  lower  than  65F  is  equal  to 
the  number  of  heating  degree  days.  Similarly,  if  the  average  daily 
temperature  is  65F  or  warmer,  the  number  of  degrees  it  is  above  65F  is 
equal  to  the  number  of  cooling  degree  days.  For  example,  a  day  aver¬ 
aging  50F  has  15  heating  degree  days  and  a  day  averaging  70F  has  5 
cooling  degree  days.  The  heating  or  cooling  degree  days  can  be  accumu¬ 
lated  separately  over  a  period  such  as  a  month  or  season  to  give  a 
measure  of  the  energy  needed  to  heat  or  cool  during  the  period.  Of 


course,  a  boating  degree  day  does  not  cancel  out  a  cooling  degree  day. 

Discreteness  and  delays .  Weather  variables  are  observed  in  a  dis¬ 
crete  manner  (1)  in  scales  of  measurement,  due  to  the  limitations  on 
the  precision  or  resolution  in  measurement  and  reporting,  (2)  in  fre¬ 
quency  of  measurement,  due  to  discrete  observing  intervals,  smoothing 
caused  by  response  lag  times  of  instruments,  or  the  need  to  accumulate 
data  over  a  period  of  time,  (3)  by  observing  at  discrete  locations,  and 
(4)  by  recording  occurrence  or  nonoccurrence  of  a  phenomenon.  For  some 
statistical  purposes,  the  data  can  be  considered  continuous,  but  it  is 
actually  discrete,  and  all  statistical  techniques  used  in  this  thesis 
will  be  designed  for  handling  discrete  data.  Note  that  discreteness 
in  time  means  that  there  are  delays  in  data  availability. 

Errors .  Errors  and  discontinuities  are  introduced  Into  weather 
data  in  many  ways. 

(1)  Errors  in  measurement  usually  do  not  cause  a  systemat ic  bias. 
They  nvay  often  be  reduced  by  averaging. 

12)  Errors  in  calibration  will  not  generally  ho  t 'educed  by 
averaging  because  of  a  systematic  bias  that  nvay  be  present  in  the  data. 
Averaging  may  help  to  detect  this  problem.  For  example,  the  duration 
of  sunlight  is  registered  by  an  instrument  that  responds  to  a  certain 
level  of  illumination,  so  it  is  highly  sensitive  to  calibration.  From 
1968  to  1978,  Columbus,  Ohio  averaged  o.c  percent  less  of  the  possible 
amount  of  sunshine  than  Dayton,  Ohio,  while  their  daytime  cloudiness 
averaged  only  1.4  percent  of  the  possible  amount  more  titan  in  Dayton. 
From  1957  to  h>b7,  Columbus  had  3.1  percent  loss  sunshine  and  .8  percent 
more  daytime  cloudiness  than  Dayton.  This  data  was  compiled  from 
Local  Climatological  Data  issues  from  1957  to  1978  for  Columbus  and 


Dayton,  Ohio  (U.S.  Department  of  Commerce,  1957  to  19781.  Hie  large 
difference  in  sunshine  between  Dayton  and  Columbus  from  1968  to  1078 
appears  to  be  due  more  to  differences  in  calibration  than  to  actual 
physical  differences. 

(3)  Elements  of  weather  that  are  not  instrumentally  measured  but 
are  recorded  by  a  human  observer  (such  as  type  of  precipitation  or 
other  weather,  amount  and  type  of  cloud  cover,  and  snow  depth'  intro¬ 
duce  human  biases  and  differences  due  to  levels  of  training  or  experi¬ 
ence. 

(A)  Changes  in  equipment  can  cause  discontinuities  similar  to 
calibration  errors.  For  example,  a  hygrcthermometer,  an  electronic 
device  to  measure  atmospheric  temperature  and  moisture,  will  not  respond 
in  the  same  way  as  a  mercury-in-glass  thermometer.  Most  Weather 
Bureau  (now  National  Weather  Service)  and  Air  Weather  Service  stations 
replaced  mercury-in-glass  thermometers  with  hygrothermometers  in  the 
late  1950’s. 

(5)  Changes  in  station  location  cause  a  significant  discontinuity. 
Until  the  1930’s,  almost  all  professionally  observed  weather  data  was 
obtained  in  cities.  Since  the  1930’s,  the  official  source  of  weather 
data  for  most  cities  has  been  the  airport,  usually  in  a  somewhat  rural 
location.  Even  when  a  weather  station  stays  at  the  same  location  in  a 
city,  the  growth  of  the  city  changes  the  weather  gradually. 

(6)  Administrative  policies  and  changes  in  observing  procedures 
or  programs  may  introduce  greater  discontinuities  than  normally  recog¬ 
nized.  An  example  in  the  Air  Force  has  been  the  elimination  of  the 
ROS  (Remote  Observing  Site)  as  the  source  of  weather  observations  at 
most  bases.  At  some  bases,  the  observer  must  walk  all  the  way  around 
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the  operations  building  if  he  wants  to  see  the  whole  sky  when  making  a 
weather  observation. 

(7)  Changes  in  public  awareness  have  caused  the  number  of  reported 
tornado  occurrences  to  increase  greatly  throughout  this  century.  As 
weather  effects  on  aircraft  became  more  fully  known,  procedures  for 
reporting  significant  weather  were  refined  because  of  the  need  for  a 
more  complete  data  base. 

Not  complete.  Since  weather  cannot  be  observed  completely,  only 
a  limited  number  of  variables  are  observed,  and  there  are  errors  in  the 
available  data,  observed  weather  data  is  not  complete.  There  are  sever¬ 
al  consequences  of  this :  ( 1 )  Observed  weather  data  is  not  completely 
deterministic,  but  can  be  considered  to  be  a  mixture  of  deterministic 
and  random  components.  12)  A  missing  observation  cannot  be  recon¬ 
structed  exactly.  (3)  It  is  not  possible  to  perfectly  forecast  the 
weather,  regardless  of  the  number  of  available  weather  observations . 

(4)  The  correlation  between  weather  observations  in  space  or  time  is 
not  complete,  although  complete  statistical  independence  usually  cannot 
be  assumed  either. 

Usable  with  caution.  In  spite  of  the  difficulties  mentioned, 
there  is  actually  a  vast  quantity  of  accessible  weather  data  of  reason¬ 
ably  good  quality.  Generally,  the  errors  are  not  large  enough  to 
significantly  affect  results  if  the  user  is  aware  of  the  data  deficien¬ 
cies.  The  existing  data  is  usually  suitable  for  most  projects,  since  it 
is  often  possible  to  calculate  new  variables  from  the  available  data, 
and  reconstruct  the  value  of  the  variable  into  the  past  to  produce  a 
long  data  series.  For  example,  heating  and  cooling  degree  days  are 
calculated  from  daily  temperatures,  so  that  the  heating  and  cooling  days 
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can  be  calculated  for  many  locations  back  to  the  1800's  even  though  the 
concept  of  heating  or  cooling  degree  days  was  not  introduced  until  about 
1915. 

Properties  of  Observed  Weather  Data  That  Will  Be  Useful  for  Simulation 

Random  components.  For  simulation  purposes ,  weather  can  be  con¬ 
sidered  to  have  random  components,  because  some  factors  are  unmeasured. 
The  limitations  in,  and  uncertainties  of,  existing  weather  observations 
allow  the  data  to  be  suitable  for  modeling  as  a  stochastic  process. 

In  many  cases,  it  is  appropriate  to  remove  some  or  all  of  the 
nonrandom  components  from  the  raw  data,  as  will  be  discussed  in  the 
next  section.  Data  analysis  would  be  performed  on  the  residuals.  All 
of  the  properties  described  in  this  section  will  be  at  least  as  valid 
for  the  residuals  as  for  the  unmodified  data. 

A  finite  range.  Even  though  there  is  almost  no  theoretical  limit 
to  the  range  of  many  variables,  such  as  the  amount  of  precipitation  that 
can  fall,  they  are  observed  to  occur  in  relatively  limited  ranges  of 
values . 

Discrete  states.  It  is  not  possible  to  measure  the  weather  with 
infinite  precision.  In  general,  the  number  of  different  values  of  a 
measurable  meteorological  variable  is  fairly  small,  although  there  may 
be  a  few  hundred  different  values.  Often,  it  is  useful  to  combine  these 
values  into  a  smaller  number  of  mutually  exclusive  ranges  of  values, 
which  are  called  discrete  states. 

Recurrent .  Recurrence  means  that  any  state  can  be  obtained  at 
any,  or  almost  any,  future  time.  There  are  no  "absorbing  states”  or 
"transient  states"  (defined  at  the  beginning  of  this  section) .  Even 
if  a  certain  value  or  range  of  values  of  a  variable  is  not  expected  to 
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occur  at  certain  times  and  is  likely  at  the  same  location  at  other 
times  (such  as  a  temperature  below  freezing  in  the  central  United 
States,  in  July  compared  with  January),  it  is  often  possible  to  remove 
nonrandom  components  from  the  data  so  that  the  states  determined  from 
the  remaining  components  can  be  considered  completely,  or  nearly  com¬ 
pletely,  recurrent. 

An  exception  to  this  property  is  a  variable  which  is  physically 
limited  to  one  value  at  certain  fixed  times  (such  as  sunshine  intensity 
when  the  sun  is  below  the  horizon) ,  or  is  so  unlikely  to  deviate  from 
a  specified  value  at  certain  times  that  it  may  be  considered  to  be  fixed 
(such  as  the  number  of  days  with  frost  in  Alabama  in  July).  Possible 
procedures  for  handling  such  cases  are  discussed  in  the  next  section. 

Aperiodic.  There  is  some  periodic  component  in  almost  any  weather 
data,  such  as  a  daily  or  annual  cycle,  but  there  is  a  significant 
variable  component  superimposed  on  this,  which  may  be  considered  a  ran¬ 
dom  (nonperiodic)  component  . 

Homogeneous  and  in  Steady  State.  Over  a  limited  time  or  distance, 
probability  distributions  can  be  considered  constant,  In  some  cases 
(as  in  the  preceding  two  paragraphs),  removal  of  cyclic  components  may 
lead  to  constant  probability  distributions.  Another  consequence  of  the 
assumption  of  a  steady  state  is  that  it  is  assumed  that  there  is  no 
significant  unilateral  trend  in  the  data  from  one  cycle  to  another,  at 
least  for  the  period  of  interest.  If  the  removal  of  nonrandom  com¬ 
ponents  does  not  give  reasonable  results,  a  limited  kind  of  steady 
state  can  be  achieved  by  a  method  discussed  in  the  next  section. 
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Preliminary  Data  Analysis 

Should  deterministic  components  be  removed  from  the  data?  If 
observed  data  were  completely  deterministic,  it  would  not  be  necessary 
to  model  the  data  by  any  kind  of  random  process.  The  exact  values  of 
any  desired  weather  variable  at  any  future  time  in  any  location  would 
be  specified  by  appropriate  equations.  However,  observed  weather  data 
is  considered  to  be  a  mixture  of  deterministic  (predictable)  and  randan 
components.  In  many  cases,  it  is  useful  to  remove  the  deterministic 
components  from  the  data  and  perform  statistical  analyses  on  the  re¬ 
maining  (random)  components. 

The  first  task  is  to  determine  if  removing  the  deterministic  com¬ 
ponents  from  the  data  will  produce  reasonable  results.  Occasionally , 
it  may  be  best  to  remove  some  of  the  deterministic  components  but  not 
attempt  to  convert  the  residuals  to  a  constant  probability  distribu¬ 
tion. 

(1)  If  there  is  a  procedure  that  will  leave  residuals  which  can 
be  considered  to  be  in  a  steady  state  (having  a  constant  probability 
distribution),  it  will  be  generally  easier  to  model  the  residuals  rather 
than  the  unmodified  data  by  a  stochastic  model.  An  example  of  this  is 
temperature  data.  In  many  cases,  the  temperature  can  be  standardized 

to  tlie  number  of  standard  deviations  from  the  average  for  the  time  of 
year  (or  day),  and  these  transformed  data  values  can  be  considered  to 
have  a  constant  (usually  normal)  probability  distribution. 

(2)  If  no  procedure  will  produce  an  exact  or  nearly  exact 
probability  distribution  of  the  residuals,  a  suitable  transformation 
may  still  produce  results  that  will  be  sufficiently  accurate  to  be 
useful.  For  example,  assume  that  monthly  precipitation  data  is  to  be 
analyzed  in  an  area  with  moderate  or  heavy  precipitation,  with  very 
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few  long  dry  periods.  If  the  data  is  standardized  by  conversion  to 
the  percentage  of  average  precipitation  for  the  time  of  year,  the 
probability  distribution  may  vary  somewhat,  but  the  variations  may  be 
small  enough  to  be  ignored  in  a  simulation. 

An  example  of  this  could  be  monthly  precipitation  for  Columbus, 
Ohio,  at  Ohio  State  University.  Normals  are  given  in  U.S.  Department 
of  Commerce,  Weather  Bureau  (1964)  and  percentile  values  for  monthly 
precipitation  are  given  by  Miller  and  Weaver  (1969).  Based  on  this 
data,  a  monthly  precipitation  at  least  50  percent  above  normal  occurs 
from  about  12  percent  of  the  time  in  February  to  20  percent  of  the  time 
in  March.  Precipitation  that  is  half  or  less  of  normal  occurs  from 
S  percent  of  the  time  in  April  to  28  percent  of  the  time  in  October. 

If  substantial  accuracy  in  predicting  extreme  events  is  not  needed, 
these  differences  might  be  overlooked.  An  average  distribution,  gen¬ 
erating  monthly  precipitation  values  more  than  50  percent  above  normal 
about  17  percent  of  the  time  and  values  more  than  50  percent  below 
normal  about  18  percent  of  the  time,  could  be  used. 

(3)  If  there  are  relatively  large  or  sudden  variations  in  proba¬ 
bility  distributions  throughout  the  year  (or  day,  etc.),  there  may  be 
no  advantage  produced  by  removing  nonrandom  components  from  the  data. 
For  example,  in  a  fairly  dry  area  with  a  definite  rainy  season,  such 
as  Arizona,  the  average  precipitation  for  different  months  may  vary 
(possibly)  from  .2  to  2  inches.  For  the  month  averaging  .2  inches, 
probably  in  many  years  there  is  no  measurable  precipitation  at  all 
(0  percent  of  normal),  while  2  inches  of  precipitation  (1000  percent 
of  normal)  will  probably  occur  occasionally.  For  the  month  averaging 
2  inches,  either  20  inches  or  no  precipitation  during  the  month  would 


probably  never  be  recorded,  so  the  probability  distributions  of 
percentage  of  normal  would  vary  greatly  throughout  the  year.  In  a  case 
such  as  this,  expressing  the  data  in  terms  of  percentiles  may  give  more 
realistic  results,  but  special  procedures  (as  described  later  in  this 
chapter)  should  be  used  if  a  variable  is  restricted  to  a  single  value 
at  certain  times.  For  example,  snowfall  may  not  occur  at  a  location 
at  all  during  some  months  of  the  year,  so  a  percentile  value  would  be 
meaningless  during  these  months. 

(4)  In  some  cases,  there  are  physical  limits  on  the  data  values, 
as  above,  either  on  one  end  or  both  ends  of  the  permissible  range. 

For  example,  precipitation  and  wind  speed  3re  restricted  to  nonnegative 
values,  and  the  proportion  of  the  sky  covered  with  clouds  must  be  in 
the  range  from  0  to  1  (0  to  10  tenths).  Sunshine  intensity  is  zero 
when  the  sun  is  below  the  horizon.  If  there  is  a  physical  limit,  and 
especially  if  one  or  the  other  of  the  limiting  values  occurs  occasion¬ 
ally  or  frequently  (such  as  months  with  no  precipitation,  when  analyzing 
monthly  precipitation  data) ,  it  is  usually  best  to  work  with  the 
original  data  without  removing  the  nonrandom  components. 

Techniques  to  use  if  nonrandom  components  can  be  removed .  If  it 
is  reasonable  to  remove  the  deterministic  components  from  a  data  series, 
analysis  and  simulation  is  a  four-step  process:  (1)  Separate  the 
deterministic  components  from  the  data  series  and  develop  an  equation 
or  procedure  to  specify  the  values  of  the  deterministic  components. 

(2)  Develop  a  statistical  description  of  the  residuals,  or  randcm 
components.  (3)  Simulate  a  series  of  random  components  using  the 
statistical  description.  (4)  Superimpose  a  series  of  deterministic 
values  onto  the  series  of  randcm  values  to  produce  a  final  simulated 


data  series. 


No  complete  discussion  of  data  transformation  techniques  can  be 
given  here,  but  some  possible  techniques  can  be  named.  A  very  basic 
procedure  is  to  standardize  data  to  the  number  of  standard  deviations 
from  the  average  value  for  the  time  of  year  (or  day,  etc.),  or  to  par¬ 
tially  standardize  by  using  the  percentage  of  the  average  value.  Spec¬ 
tral  analysis  may  be  used  to  develop  a  smoothed  curve  of  "normal" 
values  throughout  the  cycle.  In  seme  cases  where  the  data  is  restricted 
to  nonnegative  values,  it  may  be  useful  to  work  with  the  logarithms  of 
the  data  values  rather  than  the  data  itself  before  standardization  (with 
a  small  value  added  to  each  data  point  before  taking  the  logarithm  if 
there  are  any  zero  values,  as  suggested  by  Srikanthan  and  McMahon  (1968). 
Another  data  transformation  technique  is  to  convert  the  data  values 
into  percentiles,  unless  a  lot  of  the  data  values  are  the  same.  (For 
example,  if  50  percent  of  the  data  values  are  zero,  there  is  no  dis¬ 
tinction  between  the  10th  and  40th  percentiles).  Basically,  meteoro¬ 
logical  reasoning  must  be  used  to  determine  an  appropriate  transforma¬ 
tion. 

The  preceding  techniques  are  intended  to  be  used  with  variables 
that  have  a  large  nunber  of  possible  values,  such  as  monthly  precipita¬ 
tion  measured  in  units  of  .01  inch.  If  it  is  desired  to  reduce  the 
number  of  possible  values  to  a  small  number  of  states,  the  basic  data 
transformation  should  be  performed  first.  If  the  data  is  already  in 
the  form  of  a  small  nunber  of  different  states,  probably  no  data  trans¬ 
formation  should  be  performed. 

Techniques  to  be  used  if  nonrandom  components  should  not  be 
removed.  When  realistic  results  will  not  be  achieved  by  using  a  trans¬ 
formation  such  as  those  suggested  above,  or  if  the  precision  gained 
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is  not  necessary,  or  if  some  deterministic  components  are  removed  but 
the  residuals  do  not  have  a  constant  probability  distribution,  it  may 
be  reasonable  to  simply  ignore  the  lack  of  a  steady  probability  distri¬ 
bution.  For  example,  Yost  and  Aronson  (1977)  assumed  that  the  average 
cloud  cover  in  Albuquerque  is  the  same  throughout  the  year,  even  though 
the  data  shows  that  the  average  cloud  cover  varies  from  less  than  2  to 
over  5  tenths  at  different  times  of  the  year,  with  an  especially  rapid 
rise  in  average  cloudiness  in  early  July. 

If  the  differences  in  probabilities  should  not  be  ignored,  the 
most  common  technique  is  to  use  different  probability  distributions 
for  different  times  of  the  year  (or  day,  etc.).  These  subintervals 
should  be  chosen  so  that  the  probabilities  within  each  subinterval  can 
be  considered  constant.  For  example,  Haan,  Allen,  and  Street  (1975) 
used  separate  precipitation  probability  distributions  for  each  month  of 
the  year  in  their  simulation.  More  data  would  be  required,  of  course, 
to  give  the  same  confidence  for  each  probability  distribution  as  when 
considering  the  data  as  a  whole. 

Another  possible  technique  is  used  by  Feyerhemi  and  Bark  (1965). 
They  hypothesized  that  the  probabilities  involved  in  their  data  vary 
smoothly  throughout  the  year,  so  they  used  spectral  analysis  to  develop 
a  smoothed  probability  cycle  throughout  the  year  for  each  independent 
probability.  If  there  is  a  reasonable  basis  for  making  this  hypothesis, 
this  method  could  be  considered  a  "variance  reduction  technique,"  to 
lead  to  both  an  accurate  reproduction  of  the  underlying  trend  and  high 
confidence  in  the  validity  of  the  statistically-derived  cycle. 

In  some  cases,  as  mentioned  previously,  a  variable  is  restricted 
to  a  single  value,  or  is  almost  completely  limited  to  a  single  value, 
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at  some  fixed  times  (such  as  sunshine  intensity  when  the  sun  is  below 
the  horizon) .  It  is  best  to  simply  specify  the  value  of  the  variable 
at  these  times.  If  the  times  involved  are  somewhat  variable,  such  as 
the  season  of  no  snowfall  in  most  mid- latitude  continental  locations, 
random  procedures  may  be  appropriate  to  determine  the  starting  and 
ending  dates  or  times. 
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III.  Theory  of  Markov  and  Semi-Markov  Processes 

Characteristics  of  Stochastic  Processes 

A  stochastic  process  is  used  to  statistically  describe  the  opera¬ 
tion  of  a  system  that  behaves  as  follows: 

(1)  The  system  condition  is  observed  at  specified  values  of  an 
independent  variable.  This  variable  is  usually  time  (meaning  that  the 
system  is  observed  at  various  times),  but  one  objective  of  this  thesis 
is  to  observe  systems  where  the  independent  variable  consists  of  points 
in  space,  and  time  is  held  constant  at  each  observation  of  the  system. 
Only  discrete,  equally-separated  values  of  the  independent  variable  will 
be  considered  here,  and  their  values  will  usually  be  denoted  by 
sequences  of  the  nonnegative  integers. 

In  higher-dimensional  cases,  a  vector  independent  variable  will  be 
used.  If  a  weather  condition  is  measured  over  an  area,  the  independent 
variable  will  be  expressed  as  a  two-element  vector,  as  in  a  Cartesian 
coordinate  system.  This  can  be  extended  to  higher  dimensions,  and  time 
can  be  one  of  the  dimensions. 

(2)  For  each  specified  value  of  the  independent  variable  (a  given 
time,  or  a  given  point  in  space),  the  system  is  observed  to  be  in  ex¬ 
actly  one  of  a  number  of  states.  The  observed  state  can  be  called  an 
outcome .  In  this  thesis,  the  number  of  different  states  is  generally 
considered  to  be  finite. 

(3)  The  state  of  the  system  is  a  random  variable,  meaning  that 
the  probabilities  may  be  described  by  a  probability  vector,  which  has 
one  element  for  each  state  and  whose  elements  are  the  probabilities 
associated  with  each  state.  Since  the  processes  defining  the  probabil¬ 
ities  are  not  limited  to  any  particular  form,  any  vector  whose  components 
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are  nonnegative  and  sun  to  1  can  be  called  a  probability  vector  (Miller, 
1977,  page  7-3). 

One-Dimensional  Markov  Processes 

A  general  Markov  process  is  a  stochastic  process  in  which  the 
probability  distribution  at  any  observation  depends  only  on  a  specified 
set  of  other  actual  observations.  If  these  actual  observations  are 
known,  knowledge  of  additional  data  values  will  not  give  any  more  infor¬ 
mation  about  the  probability  distribution.  In  particular,  it  is  not 
necessary  to  know  all  of  the  other  data  values  in  a  series  to  specify 
the  probability  distribution  at  an  observation. 

Most  discussions  of  Markov  processes  are  confined  to  one-dimen¬ 
sional  processes,  where  the  independent  variable  is  a  scalar,  such  as 
time.  Assume  that  a  system  is  observed  at  equidistant  time  intervals 
and  that  the  system  is  observed  to  be  in  exactly  one  of  N  different 
states  at  each  time.  Then,  in  a  first-order  Markov  process,  the 
probability  distribution  of  the  state  at  any  observation  depends  only 
on  the  value  at  the  immediately  preceding  observation.  More  generally, 
a  Markov  chain  of  order  r  assumes  that  the  probability  distribution  at 
each  observation  depends  only  on  the  last  r  available  observations. 

In  a  first-order  Markov  chain,  where  i  and  j  represent  states 
(i  :  1,  .  .  ,  N;  j  :  1,  .  .  ,  N),  let  p^  be  the  transition  probability, 
or  probability  that  a  system  which  is  in  state  i  at  one  observation 
will  be  in  state  j  at  the  next  observation.  The  set  of  transition 
probabilities  for  all  i  and  j  combinations  can  be  expressed  as  an  N  x  N 
transition  probability  matrix  (TPM),  which  will  be  normally  denoted  by 
P.  P  is  a  stochastic  matrix,  a  matrix  whose  rows  are  stochastic 
vectors. 
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If  i  i  J,  a  real  transition  occurs,  meaning  that  the  system 
enters  a  new  state.  If  i  =  J,  a  virtual  transition  occurs,  meaning  that 
the  system  remains  in  the  same  state.  This  distinction  will  only  be¬ 
come  significant  when  discussing  semi-Markov  processes. 

For  the  following  discussion,  the  properties  of  observed  weather 
data  that  were  discussed  in  the  last  section  of  chapter  II  will  be 
assumed  to  hold.  Also,  each  state  will  be  considered  to  be  recurrent 
(observable  at  any  or  almost  any  time  or  location)  and  aperiodic  (not 
occurring  at  absolutely  fixed  intervals),  and  the  stochastic  process 
will  be  considered  to  be  in  a  steady  state  (the  probability  vector  will 
be  considered  constant  over  the  values  of  the  independent  variable). 

When  each  state  is  recurrent,  and  aperiodic,  each  state  is  called  ergodic 
A  more  extensive  discussion  of  each  of  these  terms  is  given  in  standard 
textbooks  about  stochastic  processes,  such  as  Feller  ( 19t>S,  Chapter  XV), 
and  these  conditions  will  define  a  certain  type  of  Markov  process. 

Let  3 ( t )  be  the  state  observed  at  time  t.  Let  V ( t )  =  (v, .  v„0  .  . 
vtN)  be  the  vector  of  unconditional  probabilities  for  each  state  at 
time  t.  Then,  the  basic  definition  of  a  first-order  Markov  chain  re¬ 
quires  that 

P  s(t)  =  J  |  s ( t- 1 )  =  i,  s(t-2)  -  h,  s(t-3)  =  g,  .  .  j 
=  P  s(t)  =  J  |  s(t-l)  =  i  |  =  Pi j . 

Feller  (1963,  chapter  XV)  shows,  using  slightly  different  nota¬ 
tion,  that 

V(l)  =  V(0)P  and  V(t)P  =  V(0)P\ 
where  V ( 0 )  is  the  initial  unconditional  probability  distribution.  As 
t  approaches  infinity,  V ( t )  approaches  a  limiting  vector  called 
V  =  (VjV.,  .  .  .  v^)  if  the  Markov  chain  is  ergodic,  which  is  true  for 


most  weather  variables.  This  limiting  vector  is  the  steady-state  distri¬ 
bution  and  does  not  depend  on  VtO).  Since  weather  has  been  occurring 
long  enougii  that  it  can  be  considered  to  be  in  a  steady  state  tat  least 
during  the  available  period  of  record),  this  limiting  distribution 
should  be  equal  to  the  observed  unconditional  probability  distribution. 

In  general,  the  value  of  V  is  obtained  from  a  transition  probability 
matrix  by  solving  V  =  VP,  using  the  fact  that  the  sum  of  all  of  the 
elements  of  V  is  1  to  obtain  a  unique  vector. 

As  described  earlier  in  this  section,  higher-order  dependence  can 
be  treated  using  Markov  concepts.  The  most  frequently-used  notation  is 
called  "expanded  state  representation."  For  a  chain  of  order  r,  the 
state  at  ait  observation  is  denoted  by  an  r-element  vector  consisting 
of  the  r  most  recent  observations,  in  sequence.  For  two  consecutive 
observations,  the  last  r- 1  elements  of  the  state  vector  for  the  first 
observation  must  be  identical  to  the  first  r-1  elements  of  the  state 
vector  for  the  second  observation.  For  example,  assume  there  are  thtvo 
states  denoted  by  0,  1,  and  D.  A  second-order  vector  state  of  u'.t' 
denotes  a  present  observation  of  state  1,  with  state  J  at  the  preceding 
observation.  At  the  next  observation,  the  only  possible  vector  states 
are  1 1,0),  (1,1),  and  ( 1,2) . 

Using  this  notation  for  the  states,  an  appropriate  notation  for 
the  probability  of  a  transition  from  state  (i,J)  to  state  tj,k)  is 
^ilk’  *?or  a  ^haln  of  order  r,  the  probability  of  transition  from  state 
ii,J,  .  .,  k)  to  state  (J,k,  .  .  , 1)  is  p.  .  ,  ,,  where  each  state 

lj  »  %  K.  I 

vector  has  r  elements  and  the  probability  has  r+t  subscripts.  The 
transition  probability  matrix  is  N1  x  in  sice,  wit h  ,*ero  probabil¬ 
ities  for  impossible  transitions.  For  example,  3-state  first-  and 


second-order  transition  probability  matrices  are  shown  in  figure  2,  with 
probabilities  that  may  be  nonzero  indicated  by  *.  Most  mathematical 
manipulation  is  the  same  as  for  first-order  matrices.  For  example, 

V  =  VP  will  give  the  limiting  value  of  the  unconditional  probability 
distribution  of  each  expanded  state. 
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Figure  2.  Transition  Probability  Matrices  for  First-  and 
Second-Order  3-State  Markov  Chain 


Semi -Markov  Processes 

Basic  theory  and  practice.  When  discrete  observation  times  are 
considered,  a  semi -Markov  process  differs  from  a  Markov  process  basi¬ 
cally  only  in  allowing  the  times  between  transitions  to  be  random  varia¬ 
bles  of  some  arbitrary  distribution.  Merrill  ( 197M  gives  an  extensive 
discussion  of  the  theory  and  use  of  semi-Markov  processes.  For  a 
transition  from  state  i  to  state  J,  there  is  a  transition  probability 
Pj^  and  also  a  holding  time  vector  Each  element  of  the  holding 

time  vector,  denoted  by  hjj(t),  with  t  =  1,  2,  .  .  .  ,  is  the 
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probability  that  the  system  will  remain  in  state  i  for  t  time  units, 
given  that  the  system  will  next  enter  state  j.  Therefore,  the  vector 
is  a  stochastic  vector.  In  most  operational  cases,  t  is  not  allowed 
to  exceed  a  certain  value  T.  Then,  the  complete  holding  time  matrix  H 
is  a  three-dimensional  matrix  with  dimensions  N  x  N  x  T.  For  basic 
operational  use,  beginning  with  a  starting  time  immediately  after  a 
transition,  and  with  the  system  in  state  i,  the  transition  probability 
matrix  is  used  to  select  the  next  state  j.  After  this,  the  holding  time 
matrix  is  used  to  select  the  holding  time  in  state  i. 

A  major  advantage  of  a  semi-Markov  process  is  that  it  may  reduce 
the  number  of  different  probabilities  that  must  be  estimated  from  the 
data,  although  the  number  of  parameters  needed  is  still  likely  to  be 
fairly  large.  For  example,  if  it  is  found  that  a  high-order  Markov 
model,  such  as  fifth  order,  is  needed,  this  may  simply  mean  that  the 
probability  that  a  state  will  persist  depends  mainly  on  how  long  the 
state  has  already  persisted,  and  that  it  is  not  unusual  for  a  state  to 
persist  considerably  longer  than  five  time  intervals.  A  first-order 
semi-Markov  model  allowing  holding  times  up  to  T  =  24  hours  may  describe 
the  data  at  least  as  well.  If  there  are  five  states,  a  fifth-order 
Markov  model  will  require  56  =  15625  different  probabilities  to  be 
specified.  Since  each  probability  vector,  with  N  =  5  elements,  must 
add  up  to  1,  there  are  N-1  =  4  independent  probabilities  in  each  vec¬ 
tor,  and  the  total  number  of  independent  probabilities  is  5^  x  4  =  12500. 

2 

A  first-order  semi -Markov  model,  with  T  =  24,  requires  5  =  25  proba- 

2 

bilities  for  matrix  P  and  5  x  24  =  600  probabilities  for  the  holding 
time  matrix  H,  or  a  total  of  625  probabilities.  The  number  of  indepen¬ 
dent  probabilities  involved  in  this  case  is  480,  which  will  be  explained 
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under  alternative  (2!  below. 


There  are  several  alternatives  or  options  in  the  operational  use 
of  a  semi -Markov  process. 

(1)  Allowing  virtual  transitions  is  optional.  If  virtual  trans¬ 
itions  are  allowed,  there  is  more  than  one  way  to  obtain  a  given  holding 
time  in  a  state.  The  distinction  between  a  three-hour  holding  time, 
compared  to  a  holding  time  of  one  hour  in  a  state,  followed  by  a  holding 
time  of  two  hours  in  the  same  state,  is  meaningless.  To  obtain  holding 
time  distributions  from  a  set  of  actual  data,  it  is  best  to  consider 
only  real  transitions,  except  when  the  holding  time  exceeds  T,  as  dis¬ 
cussed  in  the  next  paragraph. 

(2)  The  decision  tc  limit  the  maximum  holding  time  to  a  particular 
value,  T,  to  control  the  size  of  the  holding  time  matrix,  should  be 
made  from  the  data.  There  should  be  an  insignificant  probability  of  a 
holding  time  greater  than  T,  if  the  state  is  not  allowed  to  "renew" 
itself  after  it  has  persisted  for  T  time  intervals.  To  approximate  the 
distribution  of  holding  times  greater  than  T,  one  possible  approach  is 
to  allow  virtual  transitions  with  a  probability  p^  which  is  equal  to 
the  probability  of  state  i  persisting  longer  than  T  time  units.  Tins 
approximates  the  probability  of  a  holding  time  of  T  t  time  units, 
given  that  the  state  has  persisted  T  time  units,  by  the  probability  of 

a  holding  time  of  t  units. 

In  this  case,  h^it)  =  0  if  t  t  T  and  h^T)  =  1.  To  calculate 
the  number  of  independent  probabilities,  each  vector  in  P  has  N-1  =  A 
independent  probabilities  (considering  the  five-state  example  used  so 
far  in  this  chapter) ,  and  each  vector  H1 ,  has  T-1  =  23  independent 
probabilities.  However,  since  is  exactly  specified,  it  has  no 


independent  probabilities.  Therefore,  the  number  of  independent  proba¬ 
bilities  is  5  x  4  (for  P)  +5x4  x  23  (for  H,  excluding  the  Hu  vectors) 
=  20  +  460  =  480. 

(3)  A  large  amount  of  computer  storage  space  could  be  saved  by 
approximating  each  H^j  distribution  by  a  formal  distribution,  so  only 
the  parameters  of  the  distribution  would  need  to  be  stored.  For  ex¬ 
ample,  assume  that  can  be  approximated  by  a  normal  distribution. 

Then,  only  the  mean  and  standard  deviation  would  need  to  be  stored  for 
each  (i,j)  combination.  For  the  five-state  semi-Markov  example  given 
above,  the  total  number  of  parameters  to  store  would  be  25  (for  P)  +  5" 
x  2  (for  H)  =  25  50  =  75,  and  the  number  of  independent  parameters  is 

20  (for  P)  +  50  (for  H,  since  none  of  the  parameters  are  independent)  = 

70.  Appropriate  statistical  tests  would  need  to  be  performed  to  justify 
the  choice  of  a  distribution. 

Equivalence  of  discrete  semi -Markov  and  Markov  processes .  As  shown 
by  Merrill  (1974),  a  discrete  semi-Markov  process  can  be  expressed  by 
an  expanded  state  discrete  Markov  process,  so  the  theory  of  Markov 
processes  can  be  shown  to  apply  to  discrete  semi -Markov  processes. 

If  the  system  condition  at  each  observation  is  denoted  by  a  two- 
element  vector  (i,t)  containing  the  state  i  and  the  time  t  that  the 
state  has  persisted,  all  possible  combinations  of  states  and  persistence 
times  can  be  specified  uniquely.  As  in  the  expanded  state  representa¬ 
tion  of  a  higher-order  Markov  process,  only  certain  transitions  are 
possible.  For  example,  if  state  (i,t)  is  observed  at  one  time,  at  the 
next  hour  either  the  same  state  has  persisted  one  more  hour,  denoted 
by  state  (i,  t+1);  or  a  new  state  J  is  observed,  and  it  has  persisted 
up  to  one  hour,  denoted  by  (j,1).  With  this  notation,  it  is  still 


36 


possible  to  limit  the  maximum  time  to  a  value  T  if  desired.  Then, 
state  (i,T)  is  followed  by  either  (i,1)  or  (j,1),  and  there  is  a  total 
of  NT  possible  states. 


With  this  notation,  a  transition  probability  matrix  can  be  pre¬ 
pared  as  for  a  higher-order  Markov  process,  and  its  dimensions  will  be 
NT  x  NT.  However,  for  each  state,  there  are  only  N  possible  transi¬ 
tions  (one  transition  from  (i,t)  to  (i,  t+1)  and  N-1  transitions  from 

(i,t)  to  ( j ,  1 )),  so  the  number  of  probabilities  that  must  be  found  is 
2 

N  T.  This  expanded-state  representation  incorporates  both  the  P  and  H 
matrices  into  one  matrix,  so  the  number  of  independent  probabilities 
is  the  same  using  either  representation.  In  the  five-state  semi-Markov 
example,  each  state  probability  vector  has  N-1  =  A  independent  elements, 
and  there  are  NT  =  120  states,  so  the  number  of  independent  probabili¬ 
ties  is  A80. 

A  semi -Markov  concept  is  more  appropriate  than  an  expanded-state 
Markov  concept,  as  a  way  of  thinking  about  the  problem  considered  in 
Merrill's  thesis,  but  a  semi-Markov  process  appears  to  be  fundamentally 
a  one-dimensional  process.  Also,  since  the  argument  given  in  this 
section  shows  that  a  discrete  semi-Markov  process  can  be  equivalently 
expressed  as  a  Markov  process,  semi-Markov  processes  will  not  be  a 
further  major  concern  of  this  thesis. 

Estimation  and  Statistical  Testing 

Introduction.  In  justifying  the  fact  that  applying  a  Markov-type 
process  to  a  specific  set  of  data  is  a  useful  way  of  describing  some 
of  the  statistical  characteristics  of  the  data,  there  are  two  issues 
that  need  to  be  considered.  The  problem  of  estimation  discusses  how 
to  obtain  estimates  of  transition  probabilities  and  other  parameters 
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from  a  set  of  observed  data.  Statistical  testing  concerns  the 
legitimacy  of  the  results,  and  helps  answer  two  questions:  Does  a 
certain  hypothesized  Markov  chain  reproduce  the  desired  statistical 
characteristics  of  the  data  well?  Is  one  Markov  process  better  than 
another  one  in  describing  these  statistical  characteristics? 

Maximum  likelihood  estimation  of  parameters .  So  far  in  this 
thesis,  it  has  been  assumed  that  estimates  of  transition  probabilities 
and  unconditional  probabilities  are  obtainable  from  a  set  of  observed 
data.  Actually,  it  is  true  that  the  most  intuitively  appealing 
estimators  are  the  maximum  likelihood  estimators.  An  unconditional 
probability  of  a  state  is  best  estimated  by  the  number  of  occurrences 
of  that  state  divided  by  the  total  number  of  observations.  A  transition 
probability  from  state  i  to  state  j  is  best  estimated  by  the  total 
number  of  transitions  from  state  i  to  state  j  divided  by  the  total 
number  of  transitions  from  state  i.  Anderson  and  Goodman  (1957)  give 
more  details  about  maximum  likelihood  estimation  of  parameters  under 
varying  circumstances. 

For  example,  assume  that  of  a  total  of  n  observations  and  N  states, 
there  are  n^  observations  of  state  i.  The  true  probabilities  are  given 
by  vector  V  s  (vt  Vj  .  .  v^),  where 

N-1 

VN  =  1  “  -  vi  * 
i=  1  1 

Therefore,  the  likelihood  function  is 


which  is  to  be  maximized.  Taking  the  derivative  of  In  L  with  respect 
to  each  variable  and  setting  each  derivative  to  zero,  for  each  value 
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of  i  from  i  =  1  to  N  -  1 ,  gives 


d(ln  L) 

~TTT~ 


°N 


N-1  r 
"  j=1  1 


=  0 


Solving  these  N-1  equations  for  each  vi,  the  general  solution,  which 
is  the  maximum  likelihood  estimator  for  v^  (denoted  v^),  is 

a  N 

v,  :  n,/  J  ni  =  n./n  . 

1  j=1  J  1 

So,  transition  probability  p^,  where  n^  is  the  observed  number  of 
transitions  from  state  i  to  state  j,  is 

A  N 

pij  *  V  j,  nu  • 

For  a  higher  order,  let  n^  denote  the  number  of  observations 
of  the  sequence  (i,j,  .  .,  k,l),  with  p^  being  the  corresponding 
transition  probability.  For  order  r,  each  term  has  r+1  subscripts. 

Then, 

N 

pij.  .  .kl  =  nij.  .  .kl7^  nij.  .  .kl  * 


Expected  frequencies  of  runs.  Research  which  demonstrated  the 
fact  of  persistence  in  weather  data  series  (usually  sequences  of  wet  and 
dry  days)  led  up  to  the  use  of  Markov  processes  as  a  way  to  model 
weather  data.  This  research  was  usually  based  on  a  study  of  the  fre¬ 
quencies  of  wet  and  dry  sequences  of  various  lengths. 

If  there  is  no  persistence,  the  number  of  sequences  of  length  k 
will  form  a  geometric  series.  Also,  in  two-state  Markov  processes, 
the  number  of  sequences  will  have  a  geometric  form,  even  though  there 
is  persistence  from  one  observation  to  the  next.  This  section  derives 
the  expected  number  of  sequences  of  given  lengths.  These  run  length 
frequencies  will  be  used  in  hypothesis  testing  of  Markov  order. 
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The  terminology  in  this  section  will  assume  a  data  sequence  of 
wet  (W)  and  dry  (D)  days,  although  any  other  two-state  sequence  could 
be  used.  The  total  length  of  the  series  is  N  days.  The  probability  of 
a  wet  day  is  p,  and  the  probability  of  a  dry  day  is  1-p  =  q.  The 
expected  number  of  sequences  of  length  k  is  apk  (wet)  and  b(1-p)k  =  bqk 
(dry),  since  it  is  assumed  that  the  probability  of  a  wet  (or  dry)  day 
is  unaffected  by  the  character  of  the  preceding  day,  and  where  a  and  b 
are  constants  to  be  found  from  the  data  as  described  later. 

Then,  the  total  number  of  wet  sequences  is 

2  3  2 

W  =  ap  +  ap  ♦  ap' +  .  .  .  =  ap(1  +  p  ♦  p^  +  .  .  ) 

=  ap/{1  -  p)  =  ap/q  , 

and  the  total  number  of  dry  sequences  is 
2  3 

D=bq+bq  +bq  +...=  bq/(1  -  q)  =  bq/p  . 

For  a  long  data  series,  assume  that  the  number  of  wet  and  dry  sequences 
is  equal,  or  ap/q  2  bq/p,  although  the  numbers  could  differ  by  one 
regardless  of  the  length  of  the  series. 

The  total  number  of  days  in  all  wet  sequences,  or  the  total  number 
of  wet  days,  is 

2  3 

Ny  =  ap  +  2ap  +  3ap  +  .  . 

2  3  2  3  3 

2  (ap  +  ap  +  ap  ■*■  .  .  )  +  (ap  +  ap  +  .  .  )  +  (ap  +  .  .  ) 

+  •  •  • 

2  2 

=  ap(1  +  p  +  p  +  ,  ,  )  +  ap  (1  +  p  +  .  .  )  +  ap  ( 1  +  .  .  ) 

♦  •  •  • 

2  3  2 

2  (ap  +  ap  +  ap  +  ..)(l+p  +  p+  ..) 

=  apt  1  +  p  +  p2  ♦  .  .  )2  =  ap/(1-p)2  =  ap/q2  =  pN 

Similarly,  the  total  number  of  dry  days  is  ND  =  bq/p2  =  qN.  Of  course, 

2  2 

the  total  number  of  days  is  N  =  ap/q  ♦  bq/p  =  pN  qN  =  (p  ♦  1  -  p)N  =  N. 
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2  P  2  2 

Solving  for  a  and  b,  a  =  q  pN/p  =  q  N,  b  =  p  qN/q  =  p  N. 

Sunmarizing,  the  expected  number  of  sequences  is 
Wet:  (length  k)  Wk  =  pkq2N  ,  (total)  W  =  pqN 

Dry:  (length  k)  Dk  =  p2qkN  ,  (total)  D  =  pqN  =  W. 

If  there  is  persistence,  the  procedure  is  only  slightly  modified. 
Let  x  be  the  probability  of  a  wet  day  following  a  wet  day,  and  let  y  be 
the  probability  of  a  dry  day  following  a  dry  day.  Then  the  expected 
number  of  sequences  of  length  k  is  Wk  =  axk  (wet)  and  Dk  =  byk  (dry). 

The  total  number  of  sequences  is  W  =  ax/(1-x)  (wet)  and  D  =  by/(1-y) 
(dry),  and  it  is  assumed  that  W  =  D.  The  total  number  of  wet  days  is 
=  ax/(l-x)  =  pN  and  the  total  number  of  dry  days  is  =  by/(1-y)  = 
qN,  with  Nw  +  N^  =  N.  Solving  for  a  and  b, 

a  =  pN ( 1 -x ) 2/x  ,  b  =  qN ( 1 -y ) 2/y  . 

Therefore,  the  number  of  sequences  is 
Wet:  (length  k)  Wk  =  pN(1-x)2xk_1  ,  (total)  W  =  pN(l-x) 

Dry:  (length  k)  Dk  =  qN(1-y)2yk-1  ,  (total)  D  =  qN(l-y)  =  W. 

If  there  are  more  than  two  different  states  in  a  sequence,  probably 
the  easiest  method  to  calculate  the  number  of  sequences  of  each  kind  is 
to  combine  all  states  except  one  into  one  group  and  use  this  procedure 
with  the  two  groups.  The  expected  number  of  sequences  of  the  group 
containing  only  one  state  would  give  the  expected  number  of  sequences 
of  that  state.  Repeating  this  for  each  other  state  in  turn  would  give 
the  expected  number  of  sequences  for  each  state.  For  higher  order 
dependence,  the  treatment  of  probabilities  of  short  sequences  becomes 
more  complex  and  will  not  be  considered  here. 

Statistical  testing:  the  runs  test.  According  to  Mendenhall  and 
Scheaffer  (1973,  chapter  15.6),  the  expected  number  of  runs  in  a  long 
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two-state  series  of  length  Nf  with  probabilities  p  and  1-p  =  q,  and  where 

np  is  the  number  of  observed  occurrences  of  the  state  with  probability 

p,  n  is  the  number  of  observations  of  the  other  state,  and  n„  +  n  =  N, 

q  P  q 


is 


E(R) 


2n  n 

_£_3_ 

n_  +  r 


1 


which  may  be  modified  to 


E(R)  =  2Npq  +  1  . 

Also,  the  variance  is 

V(R)  -  _ ELS _ P  9  P _ 2_  =  2pq(2pq  -  1/N) 

(np  +  nq)2(np  +  nq  "  D  1/N  -  1/N2 

and  S(R)  =  /vTrT  . 

Then,  if  a  long  series  (usually  considered  to  be  N  ^  30)  has  R 

observed  runs,  the  test  statistic 

„  R  -  E(R) 

1  =  -STR") 


has  an  asymptotic  normal  distribution  with  zero  mean  and  unit  variance. 

If  the  calculated  absolute  value  of  Z  is  smaller  than  the  critical 

value  -  for  a  selected  a  level  (note  that  a  one-tailed  test  is  used) , 
ot 

the  hypothesis  of  a  random  (nonpersistent)  series  can  be  rejected. 
However,  this  test  does  not  tell  what  the  level  of  dependence  is. 

Statistical  testing  of  observed  and  expected  frequencies  of  run 
lengths:  a  chi-squared  test.  A  large  sample  which  is  categorized  into 
frequencies  of  occurrence  of  runs  of  given  states  can  be  tested  for 
goodness  of  fit  to  a  hypothesized  distribution  by  a  chi -squared  test, 
as  described  by  Elderton  (1902)  with  slightly  different  notation. 

Assume  there  are  k  frequency  groups.  When  the  expected  number  of 
observations  in  a  group  is  less  than  5  or  10,  it  is  best  to  combine 
groups  (and  reduce  the  value  of  k  accordingly)  so  that  there  are  expected 
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to  be  at  least  5  or  10  observations  in  each  group.  The  observed  frequency 
of  the  rth  group  is  mr  and  the  theoretical  frequency  is  Mr>  Then,  the 
chi-squared  test  statistic  is 

2  k  (mr  ~  V2 

X  ‘  I  ^ 
r=1  r 

with  k-1  degrees  of  freedom  and  the  usual  rejection  regions. 

,  '< 

This  test  can  be  used  to  demonstrate  the  fact  of  persistence  in  a 
data  series  by  comparing  the  observed  frequencies  with  the  frequencies 
expected  under  the  assumption  of  no  persistence.  This  test  can  also  be 
used  to  test  the  goodness  of  fit  of  a  Markov  chain  or  other  model  if  the 
expected  frequency  can  be  calculated.  However,  this  test  does  not  test 
all  of  the  independent  probabilities  in  Markov  chains  with  more  than 
two  states  or  of  second  or  higher  order,  since  it  only  tests  sequences 
of  observations  of  the  same  state. 

A  chi-squared  test  of  Markov  order.  Several  authors,  including 
Anderson  and  Goodman  (1957),  Lee  (1973),  and  Lowrie  and  Guthrie  (1968) 
describe  a  general  test  that  a  Markov  chain  is  of  order  r-1,  rather  than 
order  r  or  higher.  The  notation  of  Lee  appears  to  be  most  convenient 
for  the  purpose  of  this  paper,  so  it  will  be  used  with  only  slight 
modifications  to  better  match  the  notation  used  elsewhere  in  this 
thesis . 

Let  nij  be  defined  as  it  was  defined  earlier  in  this  section. 

Let 

N 

nij...k-  =  nij....k1 

where  there  are  N  states.  Also,  all  probabilities  are  maximum  likelihood 
estimators  (although  superscripts  indicating  that  these  values  are 


A3 


estimates  will  be  omitted) ,  so 


with 


pij...kl  =  nij...ki/nij...k-’  3X1(1 
M  N 

pj...kl  =  ^  nij. . •kl/i^i  nij- • *k-  • 

Then,  the  chi-squared  test  statistic  is 

H  :  chain  is  of  order  r-1 
o 

H  :  chain  is  order  r  but  not  order  r-1 
a 


,  N  N 

<2  =  l  l 

i=1  j=1 


N  N 

l  l 

k=1  1=1 


nij..k-l 


^ij.-.kl  ~  pj. 


j. . .kl 


Nr_1(N-1)2  degrees  of  freedom. 
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IV.  Literature  Survey  -  Statistical  Analysis  of  Time  Series 
Weather  bata 

Historical  Overview 

Attempts  to  apply  Markov  concepts  to  the  analysis  of  weather  data 
bases  are  mostly  recent,  but  the  number  of  published  applications  of 
Markov,  semi-Markov,  and  modified  Markov  processes  is  already  quite 
large.  Therefore,  this  will  not  be  an  exhaustive  survey  of  the  subject, 
but  will  mainly  focus  on  summarizing  methods  used  and  significant  results 
obtained,  with  a  goal  of  obtaining  insights  that  will  help  in  modeling 
a  two-dimensional  data  base. 

The  fact  that  weather  is  persistent,  or  that  there  is  a  positive 
correlation  between  observations,  was  demonstrated  statistically  as 
early  as  1916  (Newnham).  Newnham  and  other  authors  found  fewer  short 
sequences  and  more  long  sequences  than  would  be  expected  if  the  data  is 
randomly  distributed. 

By  the  early  1950's,  various  authors  attempted  to  fit  various 
kinds  of  logarithmic  series  to  the  frequencies  of  the  lengths  of  wet 
and  dry  spells.  Also  at  this  time,  Brooks  and  Carruthers  (1953)  sug¬ 
gested  the  possibility  of  an  underlying  Markov  chain,  although  their 
model  was  a  modification  of  a  basic  Markov  process. 

Possibly  the  first  published  account  of  an  application  of  a  Markov 
chain  to  a  weather  data  sequence  considered  sequences  of  wet  and  dry 
days  during  the  winter  season  at  Tel  Aviv  (Gabriel  and  Neumann,  1962). 
This  study  considered  Markov  chains  up  to  third  order.  Within  a  few 
years,  many  published  articles  reported  on  the  examination  of  other  data 
and  reexamination  of  the  earlier  studies  under  the  Markov  model.  Many 
of  these  articles  used  modified  Markov  or  non-Markov  models  to  improve 


the  fit.  Most  studies  considered  only  two-state  Markov  chains,  usually 
sequences  of  wet  and  dry  days.  One  of  the  first  published  weather  models 
considering  more  then  two  states  is  by  Lee  (1973). 

Extending  Markov  concepts  to  two  or  more  dimensions  was  a  nearly 
unexplored  subject  until  recently,  with  possibly  the  first  published 
articles  being  written  by  Galbraith  and  Walley  (1976)  and  Pickard  (1977). 
Pickard's  article  described  a  homogeneous  generation  scheme  in  two 
dimensions  that  leads  to  a  stationary  Markov  field,  a  grid  (or  lattice) 
for  which  the  probability  distribution  of  states  at  each  point  depends 
only  on  the  states  that  are  observed  on  a  specified  set  of  neighboring 
points.  The  concept  of  a  Markov  field  had  been  previously  developed,  but 
simulating  such  data  fields  had  been  done  mainly  by  regress ion- type 
methods. 

Statistical  Analysis  of  Persistence 

First,  a  few  concepts  that  led  up  to  the  first  published  Markov 
applications  should  be  summarized.  As  stated  before,  if  there  is  no 
persistence,  the  number  of  wet  (or  dry)  spells  of  various  lengths  would 
conform  to  a  geometric  series  of  the  form  y  =  ax11,  where  y  =  number  of 
spells  (runs)  of  length  n,  x  =  unconditional  probability  of  a  wet  (or 
dry)  day,  and  a  is  3  constant  determined  from  the  data. 

Jorgensen's  (19A9)  treatment  of  persistence  is  possibly  the  most 
sophisticated  early  approach,  and  his  methods  were  designed  to  be 
suitable  for  operational  use.  He  assumed  that  a  geometric  progression 
would  determine  the  expected  frequencies  of  lengths  of  winter  wet  and 
dry  spells  in  San  Francisco,  in  the  absence  of  persistence.  By  usii\g 
charts,  he  showed  a  comparison  of  the  probability  of  X  more  wet  (or 
dry)  days,  given  that  it  has  been  wet  (or  dry)  for  M  consecutive  days. 


with  the  probabilities  given  by  the  geometric  series.  These  charts 
showed  relatively  little  variation  of  the  probabilities  for  different 
values  of  M  (since  M  was  at  least  1),  which  helped  to  justify  the  Markov 
approaches  used  by  later  investigators. 

After  several  similar  studies  demonstrated  the  fact  of  persistence 
in  many  weather  sequences,  several  authors  attempted  to  fit  various  kinds 
of  series  to  the  frequency  counts  of  lengths  of  wet  and  dry  spells.  For 
example,  Williams  (1952)  and  Cooke  (1953)  fit  logarithmic  series  of  the 
form  y  =  (axn)/n  to  rainfall  data  for  Harpenden,  England,  and  Moncton, 

New  Brunswick.  Also,  Williams  discussed  the  logical  difference  between 
a  wet  and  a  dry  day.  A  dry  day  requires  at  least  24  to  48  hours  of  no 
precipitation,  since  (for  example)  if  precipitation  stops  immediately 
after  the  first  observation  on  one  day  and  resumes  just  before  the  obser¬ 
vation  at  the  end  of  the  next  day,  the  nearly  48-hour  dry  period  would 
result  in  no  dry  observational  days.  A  wet  day  could  result  from  only 
a  few  seconds  or  minutes,  or  up  to  24  hours,  of  precipitation  during 
the  observational  day.  Becauoe  of  this  logical  difference,  and  the  fact 
that  a  precipitation  data  series  consists  of  alternating  dry  and  wet 
periods,  which  could  be  controlled  by  different  processes,  these  authors 
tested  the  wet  and  dry  sequence  frequency  counts  separately. 

Longley  (1953)  used  a  logarithmic  series  of  the  form  In  y  =  ax  «■  b, 
and  tested  statistically  for  deviations  of  the  logarithms  of  frequencies 
of  both  dry  and  wet  spells  with  various  lengths  from  a  straight  line. 

He  also  used  the  frequency  counts  to  estimate  the  probability  of  a  wet 
day  folowing  a  dry  day,  and  a  dry  day  following  a  wet  day.  While  he 
compared  these  (maximum  likelihood  estimate'  probabilities  with  proba¬ 
bilities  derived  from  the  slopes  of  the  least-squares  lines  for  dry  and 
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wet  spells,  and  found  some  substantial  differences,  he  was  very  close 
to  using  a  Markov  model  since  the  frequency -count  procedure  derived  all 
of  the  probabilities  needed  for  a  first-order  Markov  chain. 

In  all  of  these  cases,  the  statistical  treatment  of  persistence 
resulted  in  more  realistic  descriptions  of  the  distribution  of  the 
lengths  of  dry  and  wet  spells.  However,  these  models  did  not  result  in 
a  simple-to-use  scheme  for  simulation  which  would  generate  simulated 
data  series  with  the  characteristics  that  were  found  to  be  a  satisfac¬ 
tory  explanation  for  the  observed  data. 

Two-State  Markov  Chains 

Since  the  statistical  methods  for  Markov  chains  were  quite  well 
developed  before  any  published  applications  to  weather  data  were  made, 
some  of  the  earliest  articles  did  not  confine  their  attention  to  first- 
order  Markov  chains. 

Probably  the  earliest  published  paper  is  that  of  Gabriel  and 
Neumann  (1962).  This  paper  considered  Markov  chains  up  to  third  order, 
based  on  winter  (December  to  February)  precipitation  at  Tel  Aviv  from 
1923-24  to  1949-50.  Meteorological  reasoning  using  comparisons  of 
daily  precipitation  probabilities  for  each  month  was  used  to  justify 
treating  these  months  together.  The  published  data  was  given  in  a  form 
suitable  for  immediately  calculating  the  transition  probabilities. 

Also,  chi-squared  tests  demonstrated  that  the  model  fit  the  data 
reasonably  well,  even  for  first  order.  The  advantage  of  a  Markov  model 
was  demonstrated,  since  only  a  small  number  of  parameters  were  required, 
from  two  for  a  first-order  model  to  eight  for  a  third-order  model.  All 
of  the  maximum  likelihood  estimates  of  transition  probabilities  were 
easily  calculated  from  the  data. 


Due  to  the  success  of  this  attempt  to  fit  weather  data  as  a  Markov 
process,  investigators  such  as  Caskey  ( 1963)  and  Weiss  (1964)  considered 
two-state  Markov  chains  of  first  order.  Articles  such  as  those  written 
by  Sakamoto  (1970),  Lowrey  and  Guthrie  (1963),  Feyerherm  and  Bark  (1967), 
Denny,  Kisiel,  and  Yakowitz  (1974),  and  Medhi  (1976)  considered  other 
orders  (up  to  sixth  order  in  Denny,  Kisiel,  and  Yakowitz's  paper),  with 
the  use  of  chi-squared  tests  to  determine  the  most  appropriate  order  for 
the  Markov  chain  obtained.  Sundararaj  and  Ramachandra  (1975)  considered 
only  first-order  two-state  Markov  chains,  but  applied  separate  chi- 
squared  tests  to  the  frequencies  of  lengths  of  dry  and  wet  spells  to 
show  their  individual  contributions  to  the  goodness  of  fit  of  the  total 
Markov  transition  process.  Feyerherm  and  Bark  (1965)  not  only  considered 
Markov  chains  of  order  0  (unconditional  probabilities),  1,  and  2,  but 
also  allowed  the  transition  probabilities  to  vary  smoothly  throughout 
the  year.  Their  method  was  to  use  a  Fourier  series  with  harmonics  of 
1  to  12  periods  per  year  to  describe  a  smoothed  annual  curve  for  each 
independent  probability  involved.  Since  such  smoothed  transition 
probabilities  could  not  be  statistically  tested  for  goodness  of  fit, 
chi-squared  tests  were  performed  on  transition  matrices  based  on  several 
40-day  periods  during  the  year. 

Several  authors,  including  Green  (1964,  1965,  and  1970)  and  Wiser 
(1965)  considered  modified  Markov  models,  which  are  beyond  the  scope  of 
this  paper.  Green’s  11964)  model  was  derived  from  a  continuous-time 
model.  These  models  could  be  considered  equivalent  to  Markov  chains  of 
very  high  order  (theoretically  infinite  order  in  some  cases)  or  semi- 
Markov  models. 


An 


Markov  Chains  with  More  Than  Two  States 

Not  much  work  has  been  reported  describing  Markov  chains  with  more 
than  two  states .  This  is  probably  because  of  several  reasons :  ( 1 )  More 

independent  parameters  must  be  estimated.  (2)  The  statistical  tests 
may  be  less  conclusive.  (3)  Few  additional  theoretical  insights  are 
expected  to  be  obtained.  (4)  Some  of  the  theoretical  results,  such  as 
exact  formulas  for  expected  values  and  variances,  are  not  as  easy  to  ex¬ 
press  as  when  there  are  only  two  states.  (5)  Probably  many  studies  have 
been  performed  for  operational  use  and  simulation,  for  which  exact 
statistical  goodness  of  fit  is  less  important,  and  for  which  results  and 
methods  have  not  been  widely  reported. 

One  study  which  is  in  the  operational  category  is  by  Yost  and 
Aronson  (1977),  which  presents  an  11  x  11  transition  matrix  for  average 
daily  cloud  cover  (from  0  to  10  tenths)  at  Albuquerque.  This  model 
assumes  homogeneity  although  the  unconditional  probability  distribution 
for  cloud  cover  varies  throughout  the  year.  The  reasonableness  of  the 
resulting  model  was  only  discussed  subjectively,  but  this  was  good 
enough  for  the  operational  use  of  the  model  in  this  case. 

Another  interesting  operational  approach  designed  for  simulation 
was  used  by  Haan,  Allen  and  Street  (1976)  to  simulate  sequences  of  daily 
rainfall.  The  daily  precipitation  (measured  to  the  nearest  .01  inch) 
was  separated  into  the  following  seven  classes:  none  or  trace,  .01- 
.02”,  .03-. 06",  .07-. 14”,  .15-. 30”,  .31-. 62",  and  at  least  .63”. 
Transition  matrices  using  these  seven  classes  were  computed  for  each 
month  of  the  year.  These  transition  matrices  could  be  used  to  simulate 
a  precipitation  sequence  by  classes  of  precipitation  amounts.  However , 
simulated  precipitation  amounts  to  the  nearest  .01  inch  were  desired, 
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so  uniform  probability  distributions  were  used  within  the  middle  five 
classes,  and  a  shifted  exponential  distribution  in  the  highest  precipita¬ 
tion  class.  After  the  precipitation  class  for  a  day  was  determined, 
another  random  number  was  drawn  to  produce  a  precipitation  value  to  the 
nearest  .01  inch.  The  use  of  uniform  probability  distributions  in  the 
middle  five  classes  resulted  in  a  slight  overestimation  of  the  average 
annual  precipitation  in  the  simulations,  and  methods  were  described  for 
dealing  with  this  if  the  differences  were  considered  significant. 

One  of  the  most  complete  treatments  of  the  subject  of  fitting 
Markov  chains  to  weather  data  sequences  is  that  of  Lee  (1973).  He 
considered  semi-Markov  processes,  as  well  as  Markov  processes  of  up  to 
third  order,  and  the  chi-squared  statistical  tests  of  order,  for  data 
series  of  seven  different  weather  variables.  The  Markov  processes  con¬ 
sidered  had  either  two  or  three  states.  Satisfactory  fits  were  obtained 
in  most,  but  not  all,  cases.  Where  the  model  did  not  fit  the  data 
satisfactorily,  the  limited  amount  of  data  prevented  higher  order  models 
from  being  tested. 

Lameiro  and  Bryson  (1978)  considered  a  vector  state  with  three 
components  to  simulate  a  weather  data  series  for  evaluating  the  per¬ 
formance  of  a  solar  heating  system.  The  three  components  were  temper¬ 
ature  of  stored  water,  insolation  intensity,  and  ambient  (outside  air) 
temperature,  each  with  several  possible  values.  The  simulation  used  a 
first-order  process  to  generate  values  of  the  three  variables  for  each 
hour  and  produced  fairly  accurate  results,  even  though  it  is  known  that 
significant  dependence  of  the  values  of  these  variables  will  extend 
over  a  much  longer  period  than  one  hour.  Of  course,  even  a  second-order 
matrix  probably  would  have  been  too  complex  to  use  in  this  simulation. 
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Air  Weather  Service  Technical  Report  77-273  (Miller,  1977, 
chapter  7)  summarizes  various  methods  of  Markov  simulations  of  weather 
data,  with  a  focus  on  multi-state  and  vector  state  phenomena.  Two 
methods  of  dealing  with  large  numbers  of  states  are  described,  both 
based  on  linearization  of  the  expanded-state  Markov  process. 

Semi -Markov  Applications 

Since  it  is  well  known  that  discrete  semi-Markov  processes  can 
be  converted  into  expanded-state  discrete  Markov  processes,  and  since 
it  appears  that  semi-Markov  processes  are  fundamentally  one-dimensional, 
only  a  limited  bibliography  will  be  provided  for  the  benefit  of  users. 

A  fairly  extensive  general  treatment  of  semi-Markov  processes  in  both 
continuous  and  discrete  time,  with  examples,  was  given  by  Nunn  and 
Desiderio  (1977).  Few  applications  of  semi -Markov  processes  to  weather 
data  have  been  published. 

Lee  (1973)  included  semi-Markov  models  in  his  treatment  of  seven 
weather  variables.  He  described  a  testing  procedure  to  determine  if  a 
Markov  or  semi-Markov  model  fits  the  data  best. 

Merrill  (197^)  developed  a  10-state  semi -Markov  model  for  ceiling 
and  visibility  categories  at  Bitburg  Air  Base,  Germany.  This  model 
was  based  on  hourly  observations,  included  holding  times  of  up  to  100 
hours,  and  had  a  different  set  of  transition  matrices  for  each  of  four 
seasons.  While  this  model  would  contain  over  10,000  parameters  for  each 
season  ( 10  x  10  x  100  holding  times,  10  x  10  transition  probabilities, 
and  10  unconditional  probabilities,  with  some  values  zero  because  only 
real  transitions  were  considered),  few  cases  of  holding  times  over  30 
hours  occurred.  He  suggested  that  the  number  of  parameters  could  be 
reduced  either  by  reducing  the  maximum  holding  times  to  between  25  and 
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40  hours,  depending  on  the  season,  or  by  replacing  the  holding  time 
probabilities  by  formal  distributions.  Other  possibilities  would  be  ( 1 ) 
to  consolidate  the  holding  time  categories  into  a  smaller  number  of 
nonuniform  holding  time  intervals,  with  random  procedures  similar  to 
those  used  by  Haan,  Allen,  and  Street  (1976)  to  select  any  integer 
holding  time  length  within  each  interval,  or  (2)  to  allow  virtual  trans¬ 
itions,  with  a  probability  equal  to  the  probability  of  exceeding  the 
maximum  holding  time  in  the  matrix.  The  holding  time  in  case  of  a 
virtual  transition  would  be  set  to  the  maximum  holding  time,  so  that 
a  state  could  (in  rare  cases )  persist  a  very  long  time,  just  as  it  does 
in  nature. 

Alternative  Statistical  Tests  of  Markov  Order 

Determining  the  order  of  a  Markov  chain  is  actually  a  multiple 
decision  process,  and  is  a  subjective  decision  if  chi-squared  tests  are 
used  and  there  are  more  than  two  feasible  candidates  for  the  order  of 
the  Markov  chain.  This  is  because,  as  the  order  of  the  Markov  chain 
used  to  fit  a  data  series  is  increased,  the  quality  of  the  fit  increases, 
until  in  the  extreme  case  of  a  chain  of  order  n  -  1  based  on  a  series  of 
n  observations,  the  series  is  exactly  specified  with  no  randomness. 

This  quality  of  fit  is  expressed  as  a  significance  level,  and  the  lowest- 
order  model  that  exceeds  an  arbitrarily  determined  level  of  signifi¬ 
cance  is  accepted  as  the  "true"  Markov  order  of  the  data.  If  a  differ¬ 
ent  significance  level  is  chosen,  a  different  model  may  be  chosen.  For 
example,  Gabriel  and  Neumann  (1962)  chose  a  first-order  model  to  fit 
their  data  based  on  a  significance  level  of  .05.  If  the  chosen  level 
had  been  .10,  a  second-order  model  would  have  been  required  to  fit  the 
data. 
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A  second  type  of  approach  is  the  likelihood  ratio  approach, 
first  considered  by  Bartlett  (1951).  Basically,  this  method  determines 
the  likelihood  of  each  alternative  considered  and  uses  an  empirical 
procedure  based  on  a  ratio  of  the  likelihoods  to  choose  the  proper 
Markov  order.  Two  methods  using  this  approach  involve  the  Akaike 
Information  Criterion  (Tong  1975,  Gates  and  Tong  1976,  and  Chin  1977) 
or  the  Schwarz  Bayesian  Criterion  (Schwarz  1978) .  Each  method  is  based 
on  a  slightly  different  assumption  about  the  amount  of  "information" 
to  discriminate  between  orders  (or  competing  models)  considered  in  the 
test.  Both  methods  attempt  to  balance  between  underfitting,  which  has 
a  large  variance,  and  overfitting,  which  requires  a  large  number  of 
parameters  (Gates  and  Tong  1976). 

Both  models  contain  a  term  which  will  be  denoted  by  H  ,  using  the 

ic  in 

notation  of  Chin  (1977).  This  term  equals  (-2)lnA,  ,  where  A,  is 

k,m'  k,m 

the  ratio  of  the  maximum  likelihood  under  the  assumption  of  order  k  to 
the  maximum  likelihood  assuming  order  m,  with  k  smaller  than  m.  The 
use  of  the  logarithm  of  the  likelihood  is  convenient  because  of  the  size 
of  the  numbers  typically  involved,  and  because,  under  certain  conditions 
which  are  fulfilled  by  these  Markov  chain  models,  (-2)lnA.  converges 
in  distribution  to  a  chi-squared  variable  with  a  number  of  degrees  of 
freedom  equal  to  the  number  of  independent  parameters  in  the  model 
(Mendenhall  and  Scheaffer,  section  10.3,  1973). 

For  a  sequence  of  n  data  points,  the  likelihood  is 

L  -  P(x1)  PtXglx^)  |x.j  ^2) .  .  .  P(Xn|x^,X2»  .  .  .  xn_-|)  • 

If  a  Markov  chain  of  order  r  is  assumed,  the  dependence  extends 
to  no  more  than  the  past  r  observations,  and  the  likelihood  becomes 
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L*P  -  P(x,)  P^lx^ )  .  .  .  1 1 x i »  •  •  Xj, )  P ( xp+2 lx2»  •  •  Xp+  •)  ^ 

*  *  '  PtxJxrwN-r  *  *  *  xn-1] 


=  P(x ^  p[  P(Xj|x1 ,  .  .  Xj_,)  f]  P(xi|xi_p,  .  .  xi-1)  , 
j=2  i=r+1 

where  Lp  indicates  likelihood  under  the  assumption  of  order  r. 

More  specifically,  for  order  0  (unconditional  probabilities', 


for  order  1, 


Plxi)  ’ 

L,  =  PI*,)  ^Plxjx^,)  . 


and  for  "order  -1"  (equal  probabilities,  with  S  states), 

L_1  =  s-n  . 

Maximum  likelihood  estimates  based  on  the  observations  are  used  for 
the  true  probabilities,  which  are  unknown.  Where  n.  .  .  .  is  the  number 

1 J • • • KJ. 

of  observed  sequences  of  transitions  from  i  to  j  to  .  .  to  k  to  1,  the 

maximum  likelihood  estimate  of  a  transition  probability  is 
A  S 

pij...ki  =  P(xilxi*xj»  •  •  V  =  nij...ki  /J1  nij...ki  * 

It  is  easiest  to  calculate  the  likelihood  ratios  for  successive 

orders.  In  general.  Xr_,r  =  Lr_,/Lr,  and  X^  =  '  ' 

\n_1  m-  A  convenient  formula  to  calculate  each  X  value,  according  to 

Gates  and  Tong  (1976),  is 


r-lr  =  ( -2 )  lnX 


-  I  XL  , 

r-i  ,r 

S  S  S  S 

=  2  l  V  .  .  .  II  n  kl 

i=1  j=1  k=1  1=1 


/ In  - In  - \ 

\  j^iJ.e.kl  j^J.e.kl/ 
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This  quantity  is  distributed  as  a  chi -squared  variable  with  Sr+1  -  Sr 
degrees  of  freedom  and  may  be  tested  for  significance  if  desired.  For 
the  number  of  degrees  of  freedom  is  s”1*1  -  Sm  -  Sk+1  +  Sk.  As  an 
example  from  Gates  and  Tong  (1976),  a  2-state  chain  of  n  =  120  points 
has  42  observations  of  state  0  and  78  of  state  1.  For  first-order 
transitions  reading  from  left  to  right,  there  are  25  observations  in 
vector  state  (0,0),  17  in  (0,1),  16  in  (1,0),  and  61  in  (1,1),  a  total 
of  119  sequences  of  two  observations. 

The  terms  of  QH1  are  as  follows: 


i  =  0 

i  =  0 

i  =  1 

i  =  1 


n, 


,  j  =  0:  2n  (In  —  -29  —  . 

oo  nQ0  +  nQ1 

=  2  (25)  (In  55  17  -  ln  ) 


n, 


01 


,  3  =  1 :  2nn,  (In  - 

01  n00  +  n01 

=  2  "7>  (ln  sr-rr?  - ln  vs  ' 


16 


»  J  =  0:  2  (16)  (ln  TTTTT  " 


61 


,  j  =  1:  2  (61)  dn^y- 


n0 

lnT23  1 

=  2  (13.2750827) 
n1 

In  jzq  ) 

=  2  (-8.0524471) 

In  )  =  21-8. 3423132) 

in  )  =  2(  12.06S9329) 
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so  =  17.89976.  This  has  been  found  to  be  very  sensitive  to  rounding. 

Also,  attempting  to  "correct"  the  first-order  frequency  counts  so  they 

total  120  instead  of  119  will  significantly  change  the  likelihood  ratio. 

2  1 

The  number  of  degrees  of  freedom  is  2-2  =  3. 

As  a  check,  use  likelihood  ratios: 

in  Lq  =  nQln  (-^)  +  n}ln  (-^J)  =  42  ln  ^  +  7S  ln  ^ 


=  -77.69359666  . 

ln  =  ln  Pfx^  +  ln  P(x2 1 x^ )  ♦  .  .  ♦  ln  PIx^qIx^.) 


56 


tv 


n, 


l00 


=  In  (-rwr)  +  nAAln  (- - 

00  nQ0  +  nQ1 


)  -  nQ1ln  (- 


‘01 


00  +  n01 


•1 


n10  ni 1 

+  ninln  (- - --)  +  n..ln  (- - Li__ 

10  n1(j  +  n1 1  nio  +  n1 1 


=  In  +  25  In  -jrj  +  17  In  ^  +  16  In  +  61  In  rsr? 
=  -68.74371533 


(-2)  In  (Lq/L^  =  (-2)  (In  LQ  -  In  =  -2 ( -8 . 94938033 ) 
=  17.39976  . 


The  first  term  in  the  L1  equation  should  be  explained  further. 
Basically,  in  the  series  of  120  observations,  considering  sequences  of 
two,  there  are  119  pairs.  For  25  (  =  nQ0)  +  16  (  =  n1Q)  pairs  =  41 
pairs,  the  last  number  of  the  pair  is  0.  For  17  (  =  nQ1)  +61  (  :  n^' 
pairs  -  78  pairs ,  the  last  number  is  1 .  Since  there  are  42  observa¬ 
tions  of  state  0  and  7S  observations  of  state  1 ,  and  the  first  number 
in  the  series  is  not  the  last  number  of  any  of  the  pairs,  it  must 
always  be  0.  The  first  number  of  the  sequence  is  0  and  there  are  41 
cases  where  the  last  number  of  a  pair  is  0.  If  the  first  observation 
were  state  1 ,  it  would  not  be  possible  to  have  the  observed  number  of 
pairs.  This  situation  is  automatically  accounted  for  by  using  the 

„  ,H„  formula, 
r-i  r 

Based  on  Akaike's  (1974)  development,  the  Akaike  Information 
Criterion  (AIC)  is  defined  as 

(-2)  In  (maximum  likelihood)  +  (2)  (number  of  parameters! 

=  kHa  -  2  (S^  -  Sm  -  +  Sk  )  , 

where  m  is  the  maximum  order  considered,  k  is  any  integer  from  0  to  m, 
and  S  is  the  number  of  states.  If  k  =  m,  AIC  =  0.  Akaike's  method  is 
called  Minimum  AIC  Estimate,  or  MAICE,  and  consists  of  choosing  the 


57 


minimum  value  of  AIC  for  k  =  0  to  m.  This  means  that  the  data  can  be 
best  modeled  by  a  chain  of  order  k.  If  the  AIC  values  are  all  positive, 
except  for  AIC  =  0  when  k  =  m,  the  test  is  inconclusive,  since  all  it 
can  indicate  is  that  the  Markov  order  is  m  or  larger. 

As  defined  by  Schwarz  (1978),  the  Schwarz  Bayesian  Criterion  (SBC) 
is 

SBC  =  kHn  -  (S^1  -  Sr‘  -  Sk+1  +  Sk)  (In  n)  , 

where  n  is  the  number  of  observations.  Similarly  to  MAICE,  the  method 
is  to  find  the  value  of  k  that  minimizes  SBC.  Under  Schwarz's  assump¬ 
tions,  this  test  has  a  smaller  probability  of  choosing  the  wrong  order 
of  k  than  MAICE.  It  definitely  leans  toward  lower  orders  when  there 
are  at  least  eight  observations . 

Gates  and  Tong  (1976)  used  the  MAICE  method  to  examine  rainfall 
and  sunshine  data  for  Manchester  and  Liverpool  and  found  that  chains  of 
low  order  (up  to  2)  fit  the  data  best.  They  reexamined  Gabriel  and 
Neumann's  (1962)  data  and  concluded  that  a  second-order  model  was 
better  than  a  first-order  model,  which  was  different  from  the  original 
conclusion.  Chin  (1977)  examined  January-February  and  July-August 
precipitation  data  for  over  100  stations  in  the  continental  United 
States  from  1949  to  1973.  The  appropriate  Markov  order  was  plotted  on 
maps.  Winter  precipitation  was  modeled  in  most  areas  by  chains  of  order 
2,  while  sterner  precipitation  was  usually  satisfactorily  modeled  by 
first-order  chains.  Areas  which  deviated  from  these  orders  were  in 
well-defined  geographical  areas,  and  some  tentative  hypotheses  for  the 
physical  reasons  for  these  deviations  were  given. 

The  SBC  method  is  so  new  that  few  applications  have  been  published, 
although  it  would  be  easy  to  compare  with  MAICE.  Katz  v 1979 '  disputed 
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Gate  and  Tong's  ( 1976)  conclusion  that  a  second-order  Markov  chain 
fit  the  Tel  Aviv  data  better  than  a  first-order  chain,  and  stated  that 
the  MAICE  method  has  a  .135  probability  of  estimating  that  the  order 
is  2  when  it  should  be  1. 
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V.  Additional  Topics 

Characteristics  of  Two-Dimensional  Real  Weather  Variables  and  Observed 
Weather  Data 

The  statistical  characteristics  of  real  weather  variables  and 
observed  weather  data  in  two  dimensions  are  similar  to  the  characteris¬ 
tics  of  weather  data,  in  general,  as  described  in  chapter  II.  A  review 
of  the  characteristics  will  be  given  here  to  point  out  some  issues  that 
are  important  when  attempting  to  model  a  two-dimensional  data  field.  It 
is  assumed  here  that  time  is  a  constant,  so  that  the  two  dimensions  are 
area  dimensions,  such  as  north-south  and  east-west. 

Real  two-dimensional  weather  data  is,  of  course,  still  continuous, 
deterministic,  dependent,  multivariate,  nonlinear,  nonhomogeneous , 
occurring  on  different  scales  of  motion,  and  statistically  distributed. 
Observed  weather  data  is  still  discrete,  inaccurate,  and  incomplete. 

The  issue  of  dependence,  or  correlation,  is  important  enough  to  be  dis¬ 
cussed  several  ways,  such  as  in  terms  of  inverse  probabilities,  correla¬ 
tion,  conditional  probabilities,  and  directionality. 

It  will  be  assumed  here  that  the  data  has  been  standardized  as 
much  as  appropriate  and  feasible.  For  example,  a  grid  of  temperature 
data  over  a  large  area  such  as  the  eastern  United  States  should  be 
standardized  so  that  the  data  values  have  the  same  mean  and  standard 
deviation.  The  data  values  could  then  be  considered  random  components 
or  residuals.  After  they  are  statistically  described  and  a  data  field 
of  residuals  is  simulated,  the  mean  and  standard  deviation  at  each 
point  would  be  used  to  transform  the  data  back  into  the  format  of  a 
temperature  field. 
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Inverse  Probabilities  and  Reversibility 

When  using  a  Markov  chain  for  simulation  where  the  independent 
variable  is  distance  along  a  line  rather  than  time,  there  is  no  funda¬ 
mental  direction  and  it  is  desirable  to  see  if  the  transition  probabil 
ity  matrix  is  the  same  regardless  of  direction.  Also,  in  some  cases, 
it  is  useful  to  reverse  time  and  reconstruct  a  Markov  chain  backwards 
in  time.  For  example,  we  may  want  to  achieve  a  certain  result  tor 
state)  ten  years  in  the  future  and  we  have  some  control  over  policies 
in  the  near  future.  Repeated  simulation  runs  starting  ten  years  in 
the  future  with  the  desired  state  and  working  back  to  the  present  may 
give  a  range  of  policies  that  might  possibly  lead  to  the  goal.  Then, 
simulation  using  each  of  the  possible  policies  forward  into  the  future 
would  show  how  likely  each  of  the  policies  is  to  reach  the  goal. 

The  probabilities  for  a  Markov  chain  in  a  reversed  direction  are 
called  inverse  probabilities.  Of  course,  these  probabilities  must  be 
between  0  and  1  and  the  inverse  transition  probability  matrix  must  be 
a  stochastic  matrix.  The  inverse  TPM  will  be  denoted  by  p(_1)  (in  con¬ 
trast  to  p-1,  the  inverse  of  the  TPM),  with  elements  p^"1*,  the 
probability  that  an  observation  in  state  j  is  preceded  by  an  observa¬ 
tion  in  state  i.  If  matrix  P*  ^  is  used  to  construct  a  sequence  back¬ 
ward  in  time  (or  distance) ,  it  will  construct  a  chain  with  the  same 
statistical  properties  as  when  using  P  forward  in  time.  Also,  using 
Pi-1)  to  construct  a  forward  sequence  is  equivalent  to  using  P  to  con¬ 
struct  a  sequence  backwards  in  time  or  distance. 

It  is  easy  to  demonstrate  that  a  Markov  chain  is  not  automatically 
reversible,  or  that  the  probabilities  are  not  the  same  in  both  direc¬ 
tions,  by  giving  a  counterexample.  Let  the  TPM  be 


.8 


.1 


.1 


P  = 


.4  .5  .1 


LO  .1  .9 


with  states  1,  2,  and  3.  If  the  system  is  in  equilibrium,  the  uncon¬ 
ditional  probability  vector  V  can  be  shown,  by  solving  VP  =  V,  to  be 
V  =  l  .3  .  l5  .5  1,  where  a  bar  over  a  digit  or  set  of  digits  indi¬ 
cates  infinite  repetition,  such  as  .3  =  .33333.  .  .  This  transition 
matrix  will  be  used  frequently  as  an  example. 

In  this  example,  the  probability  of  a  transition  to  state  3, 
given  that  state  1  is  the  present  state,  is  .1.  However,  if  we  assumed 
that  P  could  also  be  used  to  construct  the  sequence  backwards,  the 
probability  of  state  3  preceded  by  state  1  would  be  p31,  which  is  zero. 

It  is  also  possible  to  show  that  the  inverse  of  the  TPM,  or  p-1 , 
is  not  in  general  the  inverse  TPM.  Using  the  same  example,  the  inverse 
is 


1.375 

-.25 

-.125 

-1.125 

2.25 

-.125 

.125 

-.25 

1.125 

which  is  not  a  stochastic  matrix  even  though  its  rows  still  add  up  to  1 . 

To  lead  to  a  development  of  the  inverse  TPM,  it  is  useful  to  con¬ 
sider  the  expected  unconditional  probability  that  a  randomly  selected 
observation  will  be  in  state  i  followed  by  state  j,  assuming  a  steady 
state  situation.  The  unconditional  probabilities  can  be  expressed  in 
the  form  of  a  matrix,  which  will  be  called  the  expected  proportion 
matrix  (EPM).  It  is  denoted  by  E,  with  elements  E^,  the  unconditional 
probability  that  a  randomly  selected  observation  is  in  state  i  and  is 
followed  by  state  j. 
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The  EPM  is  obtained  from  the  TPM  by  multiplying  each  row  by  the 


unconditional  probability  for  that  row,  or  E^  =  p^v^.  For  example, 
using  P  as  above,  the  third  row  is  [  0  .1  .9]  and  the  unconditional 

probability  of  state  3  is  .5.  Therefore,  the  unconditional  probability 
that  a  sequence  of  two  consecutive  observations  will  be  (3,2)  is  (.1) 
(.5)  =  .05.  The  entire  EPM  is 


_.S(.3) 

.1(.3) 

.1(.3) 

".25 

.03 

.03  " 

E  = 

.4(.l5) 

•5( . 16) 

.  i  ( .15) 

= 

.05 

.031 

.oi5 

_  0( .5) 

.1(.5) 

•9( .5) 

_  0 

.05 

.45  _ 

The  sum  of  all  elements  of  an  expected  proportion  matrix  is  1 . 

When  a  sequence  of  observations  is  considered  in  reverse  order, 
the  EPM  based  on  the  original  sequence  is  transposed  to  give  the  uncon¬ 
ditional  probabilities  of  pairs  of  states  in  the  reversed  sequence. 

For  example,  the  number  of  occurrences  of  (2,1)  in  forward  order 
(  =  E21 )  is  the  same  as  the  number  of  pairs  of  (1,2)  in  reverse  order. 
The  probability  obtained  from  this  frequency  count  is  equal  to  the 
probability  that  a  randomly  selected  observation  is  in  state  1  and  is 
preceded  by  an  observation  of  state  2.  Since  the  unconditional  proba¬ 
bility  that  a  single  observation  is  in  state  1  is  constant  regardless 
of  direction,  the  conditional  probability  that  an  observation  is  in 
state  i,  given  that  it  is  preceded  by  state  j,  is  p^”15  =  Ejj/Pi  = 
PjiPj/pi  =  p(xilxj'*  This  result  is  confirmed  by  Feller  (1968,  chapter 
XV,  section  12).  In  this  case, 
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so, 
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It  can  be  easily  checked  that  VP  =  VP(-1^  =  V  in  this  case.  The  differ¬ 
ence  between  P  and  is  fairly  substantial  here,  but  in  most  real 

data,  the  differences  should  be  smaller.  Note  that  if  P  =  Pl -1 ' ,  then 


pijpi  =  pji.pj* 

If  it  is  desired  to  develop  a  nondirectional  matrix  from  the  data, 
an  intuitively  appealing  procedure  is  simply  to  count  the  total  number 
of  forward  and  backward  transitions  of  each  kind  by  considering  the 
data  sequence  both  forward  and  backwards.  The  TPM  obtained  from  this 
procedure  is  equivalent  to  the  average  of  P  and  P(_1*.  This  matrix, 
which  will  be  called  R,  will  be  shown  to  be  reversible. 

For  a  3  x  3  matrix,  the  average  of  P  and  p^_1*  is 


R  = 


'11 


p21p2 


2  P12  +  p^~  2 


P21P1 ■ 


2  lp21  "■  p 
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1  ,n  A  P13PK  1  .  P23P2 > 

2  P31  +  — p7’  2  p32  +  “p^ 


1  piipi 

or  in  general,  R^  =  j  (pij  +  -  p7>i) 


(p13  + 


p31p3,- 


'1 


1  P32P3, 

2  p23  *  “TT) 


■33 


To  verify  that  R  is  reversible,  it  will  be  shown  that  R^p^  = 


BiJpl  *  3  (plj  *  pJipj/Pl)Pi  =  J  lpiJpl  *  pJipj* 

RJlpj  *  3  <pji  *  piJpi/pj,pJ  =  3  (pjipJ  *  plJpl’  *  Rijpi 
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The  fact  that  VR  =  V  can  be  easily  shown  from  the  associative 


property  of  matrix  multiplication. 

For  the  example  used  so  far  in  this  chapter, 
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In  a  real  problem,  once  P  and  P*-1 ^  are  found,  a  statistical  test 
such  as  a  chi-squared  test  of  Anderson  and  Goodman  (1957)  could  be  used 
to  show  whether  p^_1)  is  significantly  different  from  P.  If  there  is 
no  significant  difference,  and  if  it  is  desired  to  make  the  simulated 
data  sequence  reversible,  using  R  as  the  transition  probability  matrix 
would  generate  a  reversible  simulated  sequence. 

Conditional  Probabilities  and  Invariance  of  Generation 

When  simulating  an  area  data  base  on  a  Cartesian-coordinate  type 
of  grid,  there  are  many  ways  to  generate  a  data  field.  It  is  desirable 
for  the  generation  process  to  produce  a  statistically  homogeneous  field. 
This  could  be  verified  in  terms  of  conditional  probabilities.  If  the 
field  is  homogeneous,  then  the  probability  distribution  at  a  grid 
point  (i+h+a,  j+k+b),  given  the  state  observed  at  point  (i+a,  j+b),  is 
the  same  regardless  of  the  values  of  a  and  b,  or  the  conditional  proba¬ 
bility  distribution  is  invariant  under  translation.  Also,  the  condi¬ 
tional  probability  at  point  (i,j)  given  the  state  at  point  (i+h,  j+k) 
should  be  the  same  regardless  of  the  variation  on  the  basic  generation 
scheme  used  (such  as  generation  by  rows  instead  of  by  columns).  This 
property  could  be  called  invariance  of  generation. 

In  the  case  of  a  basic  first-order  transition  from  one  point  to 
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the  next,  the  TPM  and  the  conditional  probability  matrix  (CPM)  are  the 
same,  since  p^j  =  P(Xj|x^).  The  CPM  for  the  transition  in  reverse  is 
P*"1'  because  p^"1^  =  P(xi|xj). 

The  conditional  probabilities  should  first  be  investigated  for 
first-order  schemes  of  generating  a  gridded  data  base,  as  shown  in 
figure  3.  The  conditional  probability  distribution  at  point  j,  given 
the  probability  distribution  at  point  i,  will  be  compared  for  all  four 
cases. 

Under  first-order  generation  schemes,  the  paths  from  one  point  to 
another  can  be  divided  into  three  types,  as  shown  in  figure  4,  and  the 
conditional  probability  structures  resulting  can  be  applied  to  the  cases 
depicted  in  figure  3.  Assume  that  the  transition  matrix  along  an  arrow 
is  P,  and  that  the  transition  matrix  in  reversed  direction  is  p' ^ . 

For  example,  a  type  2  structure  would  be  generated  by  the  transition 
matrix  obtained  from  P*-1^P. 

2 

For  a  type  1  situation,  the  total  transition  matrix  is  PP  =  P  , 

so  P ( xb I xa )  =  Pal5  •  Since  the  transition  matrix  for  type  2  is  given 

by  P^“^P,  matrix  multiplication  can  be  used  to  show  that  P(x.  lx  )  = 

N  ,  b  a 

l  pac  pcb’  For  the  P  raatrix  1136(1  as  311  example  so  far  in  this 
c=  1 

chapter,  the  conditional  probability  matrices  are 


".68 

.14 

.18“ 

TYPE  2 

"•72 

•  2S 

.10“ 

.52 

.30 

.18 

p(-l)p  . 

.36 

.30 

.34 

_  .04 

.14 

•  82_ 

*8 
_ i 

•  111 

.82_ 

Intuitively,  it  appears  that,  for  P(xb|xa) ,  type  3  and  type  2  are 
the  same,  since  the  unconditional  probability  distribution  at  point  c 
is  the  same  in  both  cases.  This  can  be  shown  to  be  true  by  finding  the 
conditional  probability  in  a  type  3  situation: 
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renal t Ion  Structures 


pUblv 


p<*a> 


N  N 

d=i  ?d  cii  PdcPcaPcb 


pa 


N  N  N 

pcapcb  pdpdc  pcapcbpc 

c=  i  d=  1  _  c=  i _ 


(note  1) 


N 


I  pac  papcbpc  N 


c=1 


pc  pa 


(note  2)  =  l  Pac'^Pcb 


c=l 


Note  1:  This  step  uses  V  =  VP.  If  each  transition  probability 

is  p.  ,  a  term-by-term  expansion  of  each  side  of  V  =  VP  gives  p  = 

N  ac 

j,  P“P*' 


Note  2:  This  step  substitutes  p^-1 '  Pa/Pc  for  Pca* 

This  shows  that  a  type  3  structure  has  the  same  conditional 
probabilities  as  type  2.  Therefore,  there  are  only  two  basic  first- 
order  structures,  shown  as  type  1  and  type  2.  These  two  structures 
may  be  generalized  as  in  figure  5,  where  r  backward  and  s  forward  steps 
occur  between  points  i  and  j. 


TYPE  1 


TYPE  2 


s  transit Iona 


r  a 

reverse  forward 

.  transitions  transitions 

< - < - * - • - =? - » - > 


Figure  5.  Final  Generalized  First-Order  Transition  Structures 
In  type  1,  the  overall  TPM  from  i  to  J  is  ps,  so  P(xjx^)  =  p^8'. 
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In  type  2,  the  overall  TPM  from  i  to  j  is  given  by  P(-r!ps,  so 


P(xj|xi) 


N 

_  V 

-  t 

k=1 


p  ("r,p  (s) 

pik  pkj  • 


The  four  sample  cases  given  in  figure  3  will  now  be  compared  in 
terms  of  conditional  probabilities,  with  the  P  matrix  given  earlier  in 
the  chapter  used  as  an  example.  The  overall  transition  matrices  are 
as  follows: 

Case  A:  (6  forward  steps)  P^ 

Case  B:  (2  reverse  and  4  forward  steps)  pl“2^ 

Case  C:  (1  reverse  and  3  forward  steps) 

Case  D:  (2  forward  steps)  P2  . 

The  numerical  values  of  the  matrices  are 


Case  A  Case  B 


“ . 465088  . 1 65984  .  368928  “ 

“.511168  .169056  .319776" 

.460992  .170080  .368928 

. 36268S  .170080  .467232 

202944  .165984  .631072_ 

_. 204992  .163936  .631072 _ 

Case  C 

Case  D 

”.5872  .1688  .2440 

.68  .14  .18 

.4144  .1880  .3976 

.52  .30  .18 

_ .  13703"  .15811  .7048 

_  .04  .14  .82 

This  counterexample  shows  that  at  least  the  suggested  first-order 
generation  schemes  are  not  invariant. 


Actually,  this  specific  example  is  unrealistic  since,  if  a  TPM  is 
not  reversible,  it  would  not  be  used  to  generate  data  in  all  directions. 
However,  if  some  reversible  matrix  R  is  used,  and  is  assumed  to  be 
constant  in  all  directions,  the  overall  transition  probability  matrix 
from  point  i  to  point  j  would  range  from  in  cases  A  and  B  to  R2  in 
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case  D,  and  the  conditional  probabilities  would  not  be  the  same. 

This  section  has  shown  that  at  least  some  first-order  generation 
schemes  on  a  two-dimensional  grid  are  not  invariant.  Since  these  were 
expressed  in  terms  of  processes  that  could  occur  on  a  line  or  in  a  time 
series,  this  demonstrates  that  present  Markov  chain  applications  are  not 
invariant  when  the  TPM  is  not  reversible.  This  is  not  a  problem  in 
most  applications  since  the  proper  matrix  would  be  used  in  each  direction 
of  simulation,  but  could  be  a  problem  in  two-dimensional  simulation 
since  using  even  reversible  matrices  does  not  guarantee  invariance. 

Correlation  Issues 

Along  with  the  issue  of  conditional  probabilities  when  generating 
a  two-dimensional  data  field,  it  is  useful  to  consider  the  correlation 
structure  of  the  data  and  see  if  this  structure  is  preserved  by  various 
simulation  processes. 

A  Markov  field,  as  defined  by  Besag  (1974),  assumes  that  the 
correlation  between  two  points  depends  only  on  the  distance  between 
them  and  not  on  the  direction.  More  specifically,  assume  that  the 
points  on  a  gridded  two-dimensional  data  field  (or  lattice)  are  denoted 
by  coordinates  (i,j)  relative  to  some  point  (0,0),  which  can  be  moved 
around  on  the  grid,  if  desired.  This  is  a  stationary  Markov  field  if 
(1)  the  probability  distribution  at  point  (i,j)  is  conditional  only  on 
a  specified  set  of  adjacent  points  such  as  (i-1,j),  (i,j-1),  (i+l,j), 
and  (i,j+1)  and  (2)  if  the  probability  distribution  is  the  same  for 
any  interior  point  on  the  lattice. 

One  way  of  describing  the  "order"  of  a  Markov  field  is  given  in 
figure  6,  although  terminology  could  differ.  The  probability 
distribution  at  point  0,  for  a  Markov  field  of  order  n,  depends  only  on 
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4  4  3  4  4 
4  2  12  4 

3  10  13 

4  2  12  4 

4  4  3  4  4 

Figure  6.  Order  of  Dependence  in  a  Gridded 
Two-Dimensional  Markov  Field: 

The  probability  distribution  at  point  0,  in  a  Markov 
field  of  order  n,  depends  only  on  the  data  values 
at  points  numbered  n  or  less. 

the  data  values  at  points  numbered  n  or  less.  Generally,  only  first- 
order  Markov  fields  have  been  considered  in  actual  investigations  be¬ 
cause  of  the  rapid  increase  in  complexity  with  higher  orders. 

Another  way  to  describe  the  correlation  structure  is  to  define  a 
"line  transect"  (Whittle  1954)  as  a  line  that  crosses  the  data  field. 

If  the  data  field  is  a  Markov  field,  the  process  on  the  line  is  a  one¬ 
dimensional  Markov  process. 

Using  actual  data,  the  correlation  structure  of  a  two-dimensional 
data  field  is  often  expressed  by  a  correlogram  in  the  form  of  a  table, 
whose  entries  are  the  correlations  between  the  data  value  at  a  relative 
point  (0,0)  and  the  values  at  other  points  with  coordinates  of  (i,j) 
on  this  relative  coordinate  system.  One  example,  based  on  orange  trees 
in  an  orchard  (Patankar  1954  and  Whittle  1954)  showed  a  correlation 
structure  of  values  gradually  decreasing  in  all  directions  away  from  the 
point  of  interest,  from  around  .54  to  .55  for  adjacent  points,  to  about 
.2  to  .3  for  points  3  to  12  units  apart.  The  correlation  of  a  point 
with  itself  (1.0)  was  a  "spike"  in  the  center  of  a  rounded  "mound." 

The  authors  remarked  that  this  appear  to  be  a  quite  common  situation  in 
analysis  of  gridded  data  bases.  The  fact  that  the  correlation  for 


71 


points  one  unit  or  more  apart  is  quite  a  bit  lower  than  1.0  shows  that 
there  is  a  substantial  amount  of  individual  variability,  while  the 
gradual  decline  with  distance  shows  that  there  is  still  a  considerable 
amount  of  structure  in  the  data  field.  If  the  correlation  decreases 
fairly  uniformly  in  all  directions  (such  as  r(a>b)  being  approximately 
equal  to  r(b^a),  where  r(a^  indicates  the  average  correlation  between 
points  with  coordinates  of  (i,j)  and  (i+a,  j+b),  for  all  possible  (i,j) 
points  on  the  grid),  there  is  no  significant  trend  in  the  data.  As 
stated  before,  any  known  trend  should  be  removed  from  the  data  before 
this  analysis  is  performed. 

If  it  is  assumed  that  the  states  are  interval  scaled  (such  as 
0,1,2,.  .),  it  is  possible  to  use  the  standard  formula  for  correlation 
to  derive  the  correlation  structure  of  a  one-dimensional  first-order 
process,  or  a  process  in  two  dimensions  considered  in  terms  of  one¬ 
dimensional  paths.  The  process  is  assumed  to  be  in  steady  state,  or 
VP  -  V.  The  correlation  derived  is  the  "expected  correlation,"  with 
the  unconditional  probabilities  distributed  according  to  V  and  the 
transition  probabilities  distributed  according  to  P.  The  standard 
correlation  coefficient  formula  is  given  by 

nI*y  -  Hy 

"(n[x2  -  ([x)2][nVy2  -  (Yy)2] 

In  this  formula,  x  can  be  considered  to  be  the  state  at  point  i. 

The  expected  number  of  observations  in  state  x±  in  n  observations  is 

npr 

Also,  y  is  the  state  at  point  j,  or  Xj.  Point  j  does  not  need  to 
be  adjacent  to  point  i.  Out  of  n  trials,  the  expected  number  of  cases 

with  state  x^  followed  by  state  x,  at  point  j  is  np.p.  .. 

J  i  i  J 
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The  values  of  the  terms  in  the  correlation  coefficient  formula 


can  be  calculated  as  follows.  There  are  N  states  and  n  observations. 

N 

E(Zx)  =  l  np.x. 
i=1  1  1 

N  N 

E(zy)  =  I  np.  I  p..x. 

i=1  1  j=1  J 

-  N 

E(Zbr)  =  l  np.x. 
i=1 

E(ry^)  =  l  np.  I  v.xf 
i=1  1  j=1  J 

N  N 

E(2xy)  =  l  np.x.  T  p  x 
i=1  1  1  j=1  J 

Next,  it  will  be  shown  that  ECCx)  =  E(Zy)  and  E(Z:x2)  =  E(Zy2) . 

This  is  intuitively  reasonable  because  the  same  unconditional  distribu¬ 
tion  is  maintained  before  and  after  the  transition.  The  last  step  of 
each  proof  used  the  fact  that  VP  =  V. 

Show  that  E(Zx)  =  E(Zy): 

N  N 

E(£y)/n  =  ^  Pt  ^  Pjj*j  =  <P,P„*,  *  P,P,j»2  ♦  -  •  •  *  P,P1NV 

+  ^P2P21X1  +  P2P22X2  +  '  *  +  P2^2NXN^ 

+  (pNpN1X1  +  PNPN2X2  +  PNPNNXN J 

=  (PlPn  ♦  P2P21  +  •  •  *  PNPN1>x1 

♦  < P^P-12  +  p2p22  +  **+  PNPM2)X2  +,,+(p1p1N  +  P2P2N+  *  ’  *PNPNN ' XN 

N 

--  Plx,  ♦  p2x2  «■  .  .  ♦  pNxN  .  pixi  =  EttxJ/n 
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Show  that  E(Ex2)  =  E(2y2): 


E(Ey2)/n  =  ^  PL  P^Xj  =  (P1P11x1  +  PiP;2x2  +  •  •  +  pipinxN  1 

+  (P2P2ixi  +P2P22x2  • • +P2P2NXN  )+” *+lpNPN1X1  +PNPN2X2  +  "+PNPNNXN  1 
=  (p1pn+p2P21+..+pNPN1)x1  +  .  .  +<P1P1N+P2P2N+--+PNPNN)XN 

2  2  r  2  *5 

=  P1x1  +  .  .  +  PNxN  =  ^  Pixi  =  E(2x^)/n 


Using  these  factors,  the  expected  correlation  coefficient  between 
points  i  and  j  becomes 


Note  that  all  factors  of  n  (the  number  of  observations)  canceled  out. 
Also,  both  terms  in  the  denominator  were  identical. 

The  correlation  can  be  generalized  to  cases  where  there  is  more 
than  one  transition  between  point  i  and  point  j  by  substituting  the 
correct  transition  probabilities  for  p^.  The  only  term  that  changes 
in  the  formula  is  the  left  term  in  the  numerator.  For  example,  in  case 
A  in  the  last  section,  the  correlation  coefficient  could  be  calculated 
by  substituting  Pjj^  for  p^.  Using  P  as  the  TPM  for  cases  A  through 
D,  these  and  other  correlations  are  calculated  as  follows.  Where 
X  =  1  1  2  3  |  ,  some  of  the  constants  needed  are  ( Ep^x^)  =  4.69^  and 
zpixi  =  5»5t  and  the  denominator  is  5.5  -  4.69^  =  .30^.  Let  r^s) 
indicate  the  correlation  between  points  s  forward  transitions  apart; 
let  indicate  the  correlation  between  points  r  reverse  transition 

apart;  and  let  r,  .  indicate  the  correlation  between  points  r  reverse 

\ “P  y  S  J 
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transitions  and  s  forward  transition  apart.  (This  notation  applies 
only  to  points  on  a  line,  and  this  notation  is  used  only  in  this  sec¬ 
tion)  .  Then,  some  typical  expected  correlations  can  be  calculated  as 


follows.  The  cases  (A  to  D)  are 

shown  in  figure 

3. 

Case 

Expected  Correlation  Coefficient 

One  transition,  forward 

E(r(i)) 

=  .81379 

One  transition,  reverse 

E(r(-1)> 

=  .81379 

Two  transitions,  forward  (case  D) 

E(R(2)) 

=  .65655 

Two  transitions,  reverse 

E(r(-2)> 

=  .65655 

One  step  reverse,  one  forward 

E(r(-1,1)) 

=  .68966 

Case  A 

E(rW 

=  .27104 

Case  B 

E(r(-2,4)) 

=  .3091S 

Case  C 

E(r(-1,3)) 

=  .45992 

It  is  interesting  to  compare  E(r^))  with  E(r^j).  It  would  be 
expected  that  Elr^j)  =  E(r(1j)n,  but  E(r^)2  =  .66226,  slightly  higher 
than  the  real  correlation.  Also,  case  A  should  have  a  correlation 
of  Efr^))^  .29046.  This  is  also  slightly  higher  than  the  real  corre¬ 
lation.  This  non-exponential  trend  of  correlation  is  confirmed  by 
Lloyd  (1974),  who  uses  an  eigenvalue  approach  for  computing  correlation. 
He  states  that  the  correlation  is  strictly  exponential  for  a  two-state 
Markov  chain,  but  that  the  trend  is  more  complicated  when  there  are 
more  than  two  states. 

Under  invariance  of  generation,  the  correlation  between  i  and  j  in 
cases  A,  B,  C,  and  D  should  be  the  same.  However,  they  range  from 
.2710  to  .6566,  using  P  as  the  example  of  a  transition  matrix.  The 
"true"  correlation  is  probably  approximately  equal  to  .6566,  the  corre¬ 
lation  that  results  from  taking  a  "shortest  path"  from  i  to  j,  as  in 
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case  D. 


It  is  not  difficult  to  prove  that  as  n  approaches  infinity, 


E(r(nj)  approaches  zero.  As  n  -*■  °°,  p^ 


(n) 


Pji  so 


l  P^i  l  Piixi  =  (  t  Pixj_ )  (  l  PjXj)  =  (  l  Pixi) 

i=1  1  j=1  J  i=1  1  j=1  J  J  i=1  11 


and  the  numerator  of  E(r,  ,)  would  be  zero. 

(00) 

The  expected  correlation  for  an  equal  number  of  steps  forward  or 
backward  is  equal,  as  expected.  However,  the  expected  correlation  for 
one  step  backwards  and  one  step  forward  (as  in  a  type  2  structure, 
mentioned  earlier  in  this  chapter)  is  not  the  same  as  the  expected 
correlation  for  two  steps  forward  or  backward.  This  shows  why  the 
correlation  is  not  the  same  for  case  A  as  for  case  B.  If  a  reversible 
TPM  is  used,  the  correlation  would  be  the  same  for  a  constant  number 
of  steps  regardless  of  the  combination  of  steps. 


Directionality  and  Inhomogeneity 

So  far  in  this  chapter,  it  has  been  assumed  in  many  cases  that  the 
transition  matrix  is  reversible  and  is  also  totally  nondirectional.  In 
many  cases,  deterministic  components  of  the  data  can  be  removed  and  the 
data  can  be  processed  so  that  it  is  homogeneous.  However,  in  some 
cases,  it  is  not  possible  or  appropriate  to  try  to  remove  these  nonran¬ 
dom  components.  An  example  could  be  occurrence  or  nonoccurrence  of 
snowfall  over  a  fairly  large  area.  Since,  in  general,  the  frequency 
(unconditional  probability)  of  snow  diminishes  toward  the  south,  the 
transition  probabilities  cannot  be  the  same  for  locations  on  a  north- 
south  line  as  for  locations  on  an  east-west  line.  It  is  useful  to 
discuss  some  of  the  consequences  of  directionality,  or  (as  used  here) 


76 


a  case  where  the  TPM  may  be  constant  from  east  to  west,  but  varies  from 
south  to  north.  No  substantial  results  will  be  derived  here,  but  some 
data  handling  methods  will  be  discussed. 


A  simple  way  to  consider  inhomogeneity  that  causes  a  gradual 
trend  over  an  area  from  north  to  south  is  to  assume  that  simulation  will 
take  place  only  in  "eastward"  and  "southward"  directions  starting  from 
the  "northwest"  comer  of  the  area.  Simulation  would  not  take  place 
as  in  cases  A  or  D  shown  in  figure  3.  Let  the  transition  matrix  for 
"eastward"  transitions  be  P,  and  let  the  TPM  for  "southward"  transitions 
be  Q. 

Assume  that  a  "shortest  path"  is  taken  from  (0,0)  to  (i,j), 
where  lexicographical  ordering  is  used,  so  i  indicates  the  number  of 
rows  down  ("south")  from  (0,0)  and  j  indicates  the  number  of  columns 
to  the  right  ("east")  of  (0,0).  Also  assume  that  a  one-dimensional  first- 
order  method  of  generation  is  used.  An  example  will  show  that  there 
is  a  difference  in  transition  probabilities  even  when  there  are  only 
three  transitions  between  (0,0)  and  (1,2).  Let  the  transition 
probability  matrices  be 


".56 

.44“ 

“.31 

.69" 

and  Q  = 

_  .33 

•67_ 

_  .09 

•  91  _ 

The  three  possible  cases  are  shown  in  figure  7. 
The  matrices  for  the  transitions  in  figure  7  are 


CASE  E 

QPP  = 

" .422299 

.577701" 

_. 410661 

.589339. 

CASE  F 

POP  = 

“.379036 

.620964“ 

_ .367398 

.  632602 _ 

Figure  7.  Three  Possible  Paths  for  a  Two-Dimensional 
Simulation  by  One-Dimensional  Processes 

[.190936  . 809064 

.179298  .820702 

In  general,  the  same  transition  probabilities  do  not  result  if 
the  transition  matrices  P  and  Q  are  different,  because  matrix  multi¬ 
plication  is  not  conmutative.  However,  most  of  the  available  theory 
that  can  be  applied  to  two-dimensioned  situations  assumes  homogeneity 
and  this  problem  will  not  occur  if  the  data  is  homogeneous.  The  data 
that  is  used  as  an  example  in  chapter  VII  is  very  nearly  homogeneous, 
so  this  problem  will  not  be  considered  further  in  this  thesis. 

Another  type  of  inhcmogeneity  that  is  frequently  encountered  is  a 
fairly  sharp  division  from  one  area  to  another.  This  could  occur  at  a 
topographic  boundary  such  as  a  coastline,  or  along  a  mountain  range. 
Since  attempting  to  handle  this  exactly  using  theory  increases  com¬ 
plexity  greatly,  in  this  thesis  it  will  be  assumed  that  investigations 
are  confined  to  areas  that  are  homogeneous.  In  case  of  an  inhomogeneity, 
an  analogous  approach  to  that  of  Lloyd  (197 A)  can  be  taken:  to  use  a 
"generalized  stationarity"  that  allows  the  entire  field  of  interest  to 


be  divided  into  areas  that  can  be  considered  homogeneous.  Transition 
matrices  would  be  developed  from  the  data  for  each  separate  area,  and 
the  complete  data  field  could  be  simulated  by  using  the  appropriate 
transition  matrices  for  each  area.  Effects  at  the  boundary  between 
areas  could  be  considered  insignificant,  in  general. 


VI.  Literature  Survey  - 
Simulations  of  Two-Dimensional  Data  Bases 

Non-Markov  Models 

There  have  been  very  few  attempts  to  simulate  two-dimensional 
data  bases.  This  section  will  discuss  regression- type  simulations, 
which  are  not  exactly  Markov  in  some  cases.  In  a  two-state  case,  the 
two  models  can  be  equivalent.  As  described  in  chapter  V,  all  coordi¬ 
nates  of  points  will  assume  a  coordinate  system  in  "lexicographical 
ordering." 

Galbraith  and  Walley  (1976,  15-2)  describe  a  regression  scheme 
where  the  expected  value  at  a  point  depends  only  on  the  value  at  the 
two  nearest  points  that  are  known.  (For  a  simulation  starting  at  the 
"northwest"  corner  and  proceeding  "eastward"  and  "southward",  these  two 
known  points  are  immediately  "north”  and  "west”  of  the  unknown  point). 
The  unknown  point  is  (i,j)  and  the  known  points  are  ( i— 1 , j )  and  ( i , j— 1 ) - 
Then  E(x.  .  I all  predecessors)  =  a  +  bx.  .  +  cx.  .  .  +  dx.  .  .x.  .  . , 
where  a,  b,  c,  and  d  are  found  by  linear  regression  methods.  The 
interaction  term  may  be  deleted  for  simplicity.  However,  for  a  two- 
state  chain,  keeping  the  interaction  term  in  will  allow  all  transition 
probabilities  in  terms  of  the  values  at  points  (i, j-1 )  and  ( i— 1 , j)  to 
be  specified  as  follows: 

P(1|0,0)  =  a  =  Ptx^  =  =  0»  xi-i,j  =  0) 

P(1|0,1)  =  a  +  c 
P(  1 1 1 ,0)  =  a  +  b 
P(  1 1 1 , 1 )  =  a  +  b  +  c  +  d 

To  use  the  above  equation  for  simulation  of  a  two-state  (0  or  1) 
Markov  chain,  either  with  or  without  the  interaction  term,  the  value  of 
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E(x^j)  is  computed  and  a  random  number  between  0  and  1  is  drawn.  If 
the  random  number  is  greater  than  1  -  E(x^j),  the  state  at  (i,j)  is  1 
and  otherwise  it  is  0. 

Because  of  the  limited  (and  fixed)  number  of  parameters,  this 
procedure  cannot  be  used  with  more  than  two  states,  except  as  an  approx¬ 
imation.  A  possible  simulation  method  would  be  to  compute  E(x^),  add 
an  error  term  (also  derived  from  the  data)  and  use  some  procedure  for 
converting  the  resulting  value  into  one  of  the  permitted  states. 

Welberry  and  Galbraith  (1973)  give  examples  of  a  two-state  re¬ 
gression  model  for  crystal  growth  (where  0  =  absence  of  an  atom,  and  1  = 
presence  of  an  atom).  They  assumed  that  there  were  no  interactions,  so 
that  they  could  obtain  some  exact  theoretical  results. 

These  papers  showed  that  a  major  problem  is  to  statistically  de¬ 
rive  the  expected  value  of  the  variable  of  interest,  or  equivalently, 
the  stationary  distribution.  For  a  one-dimensional  chain,  the  stationary 
distribution  is  obtained  by  solving  VP  =  V  and  the  expected  value  can 
be  easily  obtained  from  V.  However,  Galbraith  and  Walley  ( 1976)  showed 
that  when  an  interaction  term  is  included  in  the  regression  equation, 
there  are  always  more  unknowns  than  equations,  so  that  the  expected 
value  cannot  be  found.  In  this  thesis,  the  transition  probabilities  and 
other  constants  are  found  from  the  data  by  counting  transitions,  and 
the  expected  values  and  stationary  distributions  can  be  also  obtained 
from  the  data  by  the  same  type  of  process. 

Matalas  (1967)  describes  a  "multivariate  weakly  stationary  gener¬ 
ating  process"  that  could  be  easily  adapted  to  a  two-dimensional  data 
base.  This  method,  since  it  was  developed  to  handle  a  time  series  of 
several  interrelated  variables,  would  generate  either  a  row  or  column 


81 


of  data  values  simultaneously.  The  cross-correlations  in  and  between 
the  row  or  column  being  generated  and  the  last  row  or  column  generated 
could  be  accounted  for.  However,  this  procedure  could  not  handle  a 
Markov  field  of  order  greater  than  two  because  correlations  are  con¬ 
sidered  only  in  two  adjacent  rows  or  columns. 

Markov  Simulations  in  Two  Dimensions 

The  following  suggested  procedures  differ  from  the  "original  vec¬ 
tor  Markov  chain  approach"  (Pickard  1977)  but  they  have  Markov  prop¬ 
erties  since  the  next  computed  value  depends  only  on  certain  previously- 
computed  data  values  and  not  on  the  entire  process.  The  original 
vector  approach,  for  all  Markov  chains,  requires  the  input  and  output 
state  spaces  to  be  the  same.  For  example,  in  a  second-order  Markov 
chain,  overlapping  states  are  defined,  where  the  two-component  vector 
state  consisting  of  the  most  recent  and  present  observations  in  sequence 
define  the  probability  distribution  for  transition  to  the  two-component 
vector  state  containing  the  present  and  next  observations  in  order.  If 
overlapping  states  are  not  required,  the  transition  is  to  the  single 
state  consisting  of  the  next  observation.  In  this  case,  the  transition 
probability  matrix  is  not  square,  but  is  x  N.  All  of  the  elements 
of  this  matrix  may  be  nonzero  since  there  are  no  impossible  transitions. 

The  following  proposed  two-dimensional  Markov  schemes  will  consist 
of  some  vector  sequence  of  known  observations  that  determine  the  prob¬ 
ability  distribution  for  the  next  observation.  In  all  of  these  cases, 
two  adjacent  boundaries  of  the  data  field  are  generated  by  one-dimension¬ 
al  processes  of  some  kind.  The  remaining  points  (including  the  other 
two  boundaries)  can  be  considered  "interior  points,"  which  are  generated 
by  two-dimensional  processes. 


Galbraith  and  Walley  (1976)  consider  a  Markov-type  simulation 
procedure  analogous  to  their  regression  model  with  interaction  terms. 
This  approach  was  suggested  earlier  by  Bartlett  (1967),  who  called  the 
process  a  "doubly  stochastic  Markov  chain."  In  this  procedure,  after 
the  top  and  left  (or  any  two  adjacent)  boundaries  are  generated  by  one¬ 
dimensional  processes,  the  "interior"  data  values  are  generated  by 
rows  (or  columns).  The  transition  matrix  for  transition  to  a  point 
(i,j)  is  based  on  the  state  at  point  (i-1,j),  immediately  to  the  left, 
but  a  different  transition  matrix  is  used  for  each  possible  state  at 
point  (i,j-1),  immediately  above  (i,j).  This  could  be  consolidated 
into  one  non-square  matrix  if  the  input  states  are  the  vector  states 
corresponding  to  the  observations  to  the  left  of  and  above  each  point. 

These  authors  were  not  able  to  develop  exact  formulas  for  the 
expected  value  E(x^j),  where  in  a  two-state  binary  chain,  E(x^)  is 
the  proportion  of  the  time  that  the  state  is  1,  except  in  special  cases. 
Galbraith  and  Walley  (1976)  numerically  solved  a  conjectured  general 
solution  for  E(x.  .)  in  a  two-state  case,  and  the  numerical  results 
closely  approximated  the  conjectured  values. 

To  simulate  using  this  procedure,  the  total  number  of  probabilities 
needed  is  N  (unconditional  probabilities)  +  N2  (first-order  Markov 
transition  probabilities  for  generation  of  boundary  values,  assuming 

3 

the  same  probabilities  in  both  horizontal  and  vertical  directions)  -*•  N 

3  2 

(conditional  probabilities  for  interior  points)  =  N  +  N  +  N.  The 
number  of  independent  probabilities  is  N2  -  1. 

Pickard  (1977)  developed  a  similar  procedure,  which  generates  a 
homogeneous  Markov  random  field.  His  model  is  based  on  the  joint 
distribution  for  the  vertices  of  a  unit  square  where,  for 
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example,  A,  B,  and  C  are  known  and  D  is  unknown.  Then,  for  interior 

points,  the  probability  distribution  at  D  is  based  on  the  vector  state 

(A,B,C).  As  in  Galbraith  and  Wally's  paper,  his  calculations  apply  only 

to  two-state  processes,  although  his  procedures  can  be  extended  to  (1) 

nonhomogeneous  data  bases,  (2)  more  than  two  states,  or  (3)  more  than 

two  dimensions.  The  number  of  probabilities  required  in  this  model  is 
2  4 

N  +  N  (as  above)  +  N  (conditional  probabilities  for  D,  based  on  A,  B, 

4  2  4  3  2 

and  C)  =  N  +  N  +  N,  with  N  -  N  +  N  -  1  independent  probabilities. 

Pickard  also  proved  that  the  conditional  probability  at  a  point 
(i  +  k,  j  +  1)  based  on  the  state  at  (i,k)  is  the  same  regardless  of 
the  path  taken  as  long  as  it  is  one  of  the  shortest  paths.  The  arguments 
used  to  prove  this  also  led  to  an  explicit  formula  for  the  correlation, 
which  is  geometric  when  there  are  only  two  states. 

Another  Possible  Procedure  -  "Method  1" 

The  following  suggested  simulation  procedure  is  illustrated  by  the 
diagram  in  figure  8.  Where  an  arrow  is  shown,  the  value  at  the  head 
of  the  arrow  is  determined  by  first-order  one-dimensional  Markov  pro¬ 
cesses  based  on  the  value  at  the  tail  of  the  arrow.  Where  a  square  is 
shown,  the  values  at  the  two  lower  corners  are  obtained  from  the 
probability  distribution  based  on  the  values  at  the  two  upper  corners 
of  the  square.  This  procedure  requires  an  x  matrix,  plus  a  first- 
order  matrix,  and  a  vector  of  unconditional  probabilities,  so  the  num¬ 
ber  of  probabilities  requires  is  the  same  as  in  Pickard's  (1977)  model. 
This  method  of  simulation  will  be  called  "Method  1."  A  possible 
advantage  of  this  procedure  is  that  fewer  random  numbers  would  need 
to  be  generated  to  produce  a  data  field  of  a  given  size. 
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VII. 


Examples  and  Discussion  of  Two-Dimensional 
Data  Base  Simulations 


Data  Base:  Selection,  Analysis,  and  Characteristics 

Since  no  published  applications  of  Markov  processes  to  a  two- 
dimensional  data  base  have  been  found,  although  the  theory  has  been 
established,  it  was  decided  that  an  example  would  demonstrate  the  use¬ 
fulness  of  the  proposed  methods,  and  reveal  if  there  are  any  signifi¬ 
cant  operational  problems.  The  computer  calculations  were  found  to  be 
quite  simple,  and  none  of  the  programs  required  more  than  about  ten 
seconds  on  a  CYBER  175  computer  to  calculate  transition  probabilities. 

The  data  base  chosen  for  this  example  was  obtained  from  weekly 
maps  of  temperature  departures  from  the  19^1-70  normal,  published  in 
issues  of  the  Weekly  Weather  and  Crop  Bulletin  (USDA) .  A  series  cf  maps 
was  selected  for  analysis,  consisting  of  the  15  weekly  naps  starting  with 
the  first  week  (Monday  through  Sunday)  ending  in  December,  for  the 
seven  winters  1972-73  to  1978-79.  A  grid  with  12  rows  and  15  columns 
was  superimposed  on  an  area  of  the  eastern  and  central  United  States 
(see  figure  9).  The  180  variables  were  always  considered  in  lexico¬ 
graphical  order,  with  the  rows  considered  from  top  to  bottom  and  the 
columns  considered  from  left  to  right.  In  the  computer  programs,  the 
variables  were  considered  as  a  single  sequence  of  180  variables,  but 
in  other  cases  the  variables  were  denoted  by  a  letter  denoting  the  row 
and  a  number  indicating  the  column  (such  as  C12),  or  by  a  pair  such  as 
(r,c),  where  r  indicates  the  row  and  c  indicates  the  column.  Under 
this  notation,  the  vector  pair  is  the  same  as  a  90  degree  clockwise 
rotation  of  a  Cartesian  coordinate  system. 

The  temperature  departures  were  read  from  the  map  at  each  grid 
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Figure  9  -  Typical  Weather  Map  of  Weekly  Temperature  Departures  from  Normal, 

with  Superimposed  Grid 


intersection.  It  was  assumed  that  the  area  was  sufficiently  homogeneous 
for  these  analyses.  Most  of  the  inhomogeneity  was  removed  by  converting 
the  data  values  to  differences  from  normal  temperatures.  With  a  total 
of  105  maps,  there  were  16900  data  points.  These  data  values  were 
stored  on  a  computer  file.  For  example,  the  stored  data  values  for  the 
period  covered  by  the  map  in  figure  9  are  shown  in  figure  10.  The  line 
number  consists  of  two  digits  each  for  the  month,  day,  year,  and  row 
number,  where  the  date  given  is  the  ending  date  of  the  weekly  period. 

After  the  data  base  was  stored,  statistical  characteristics  of  the 
data  were  examined  using  the  Statistical  Package  for  the  Social  Sciences 
(SPSS).  These  included  the  mean,  standard  deviation,  skewness,  maximum, 
and  minimum  for  each  grid  point,  and  are  shown  in  figures  14  through  18. 
While  there  are  variations  on  every  map,  the  data  obviously  has  an 
organized  structure,  and  there  are  meteorological  explanations  for  most 
of  the  variations.  Actually,  this  seven-year  period  may  not  have  been 
typical  because  of  the  large  number  of  extremes  recorded  (especially 
exceptionally  cold  winters),  and  even  a  typical  seven-year  period  is 
not  long  enough  to  describe  the  statistical  characteristics  of  the  data 
with  high  accuracy.  However,  this  data  base  is  sufficiently  realistic 
for  demonstration  purposes. 

To  produce  a  more  uniform  data  base,  the  data  values  were  stand¬ 
ardized  to  the  number  of  standard  deviations  from  the  average  for  each 
grid  location.  The  seven-year  average  was  used  rather  than  the  1991-70 
normal,  since  many  locations  were  consistently  colder  than  normal  during 
these  seven  winters.  Figure  11  shows  the  standardized  data  obtained 
from  the  data  in  figure  10. 

For  Markov  analysis,  the  data  was  categorized  into  three  and  five 
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fron  the  Average  for  Each  Grid  Point 


Standard  Deviation  of  Temperature  Departures  from  Average 
(.001  Degree  Fahrenheit). 


Plgure  1 6  -  Statistical  Characteristics  of  Data  Base  for  Each  Grid  Point 

Based  on  105  Cases: 

Skewness  (.001  Degree  Pahrenhelt). 
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Figure  i?  -  Maximum  Positive  Temperature  Departure 
from  Normal  for  Each  Grid  Point  (Based  on  105  Cases) 


Figure  18  -  Maximum  Negative  Temperature  Departure 
from  Normal  for  Eaoh  Grid  Point  (Based  on  105  Cases) 


states,  with  state  1  indicating  the  coldest  temperatures .  In  the  three- 
state  case,  the  divisions  between  states  were  at  -.75  and  ♦.75  standard 
deviations  from  average.  In  the  five-state  case,  the  divisions  were  at 
-1.5,  -.5,  *.5,  and  +1.5  standard  deviations.  Most  of  the  data  analysis 
was  carried  out  for  both  the  three-state  and  five-state  cases,  al¬ 
though  only  five-state  simulations  were  performed.  The  five-state  and 
three-state  data  fields  obtained  from  the  data  in  figure  10  are  shown 
in  figures  12  and  13  respectively,  and  additional  typical  real  data 
fields  are  shown  in  figures  19  and  20. 

The  correlation  structure  of  the  data  was  also  examined  using  SPSS. 
There  are  many  different  ways  of  displaying  the  relationships,  and  a 
few  examples  are  shown  in  figures  21  through  2S.  All  correlations  were 
based  on  the  original  data,  not  on  the  data  as  reduced  to  three  or  five 
states. 

Figure  21  is  a  correlogram,  showing  the  average  correlation  through¬ 
out  the  grid  for  all  points  no  more  than  four  rows  or  columns  from  an 
arbitrary  point.  Basically,  the  table  value  in  position  ii,j)  is  the 
autocorrelation  coefficient,  or  average  correlation,  between  all  pa-rs 
of  points  with  grid  positions  (r.c)  and  (r  ♦  i,  c  ♦  j).  For  example, 
the  second  number  in  the  third  row  (i  =  -2,  j  =  -3)  represents  an  auto¬ 
correlation  coefficient  of  .3567,  which  is  the  average  correlation 
between  all  pairs  of  points  where  the  first  point  is  two  grid  intervals 
north  and  three  grid  intervals  west  of  the  other.  Since  the  second 
point  is  three  intervals  east  and  two  intervals  south  of  the  first,  the 
correlogram  is  symmetric  about  the  origin  and  the  correlation  at  i  =  2, 
j  =  3  is  also  .3567. 

The  correlation  contours  are  nearly  circular  ellipses,  with  the 


Same  Data  as  in  Figure  19  ,  in  Terms  of  Three  States 
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Plgure  22  -  Correlosrans  for  Selected  Grid  Points  In  the  Data  Ease 
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Figure  25:  Correlation  Between  Points  Three  Horizontal  Units  Apart 

(Leading  Decimal  Point  Omitted). 

Example:  Upper  Left  Value  Is  Average  Correlation  Between  Points  A1  and  A4 

Based  on  105  Cases. 
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Plgure  2?  -  Correlation  Between  Points  Separated  by  Two  Rows  and  Two  Columns 
In  a  Northwest  -  Southeast  Direction  (Leading  Decimal  Point  Omitted). 
Example:  Upper  Left  Value  Is  Average  Correlation  Between  Points  A1  and  C3 

Based  on  105  Cases. 
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Correlation  .88  or  less  Correlation  at  Least  .92  £ ^ 

Figure  28  -  Correlation  Between  Points  Separated  by  Two  Rows  and  Two  Columns 
In  a  Northeast  -  Southwest  Direction  (Leading  Decimal  Point  Omitted). 
Example:  Upper  Left  Value  Is  Average  Correlation  Between  Points  A3  and  Cl 

Based  on  103  Cases. 


long  axes  oriented  mostly  from  southwest  to  northeast,  which  is 
approximately  parallel  to  the  average  winter  wind  direction  in  the 
middle  atmosphere  (the  level  that  most  strongly  influences  the  movement 
of  air  masses)  in  the  central  and  eastern  United  States. 

Figure  22  shows  the  correlation  structure  for  the  corner  points 
and  a  center  point  of  the  grid.  The  higher  correlations  to  the  south¬ 
east  of  point  A1  (western  North  Dakota!  correspond  to  the  prevailing 
northwesterly  winds  and  frequent  Arctic  air  masses  there.  The  pattern 
around  point  F8  (near  St.  Louis)  reflects  the  tendency  of  many  air 
masses  to  reach  the  area  from  either  the  southwest  or  northwest,  and  to 
leave  the  area  by  moving  up  the  Ohio  Valley. 

Figures  23  to  23  are  presented  to  show  that  there  are  no  major 
discontinuities  in  the  data.  A  discontinuity  would  be  shown  by  a  fairly 
sudden  decrease  in  the  correlation  coefficient  over  a  short  distance. 
Possibly  the  most  significant  discontinuities  are  in  the  vicinity  of 
Michigan  (due  to  the  modifying  effect  of  the  Great  Lakes)  and  on  the 
eastern  slope  of  the  Rockies  (due  to  greater  frequency  of  Chinook  winds 
in  the  western  parts  of  the  area),  but  the  effects  on  correlation  are 
fairly  small  and  are  no  greater  than  the  variations  in  correlation  over 
the  lower  Mississippi  Valley,  where  the  variations  are  due  to  the  pre¬ 
vailing  movement  of  air  masses  and  not  to  any  topographical  features. 
Therefore,  it  appears  reasonable  to  neglect  the  inhomogeneities  since 
they  have  only  minor  effects. 

Figures  23  and  24  show  correlations  between  points  about  100  miles 
apart  (with  all  correlations  exceeding  .95),  and  figures  25  to  2S  show 
the  correlation  between  points  about  300  miles  apart  (with  all  correla¬ 
tions  exceeding  .3).  As  mentioned  in  chapter  V,  the  autocorrelations 


in  the  data  field  considered  by  Patankar  (1954)  and  Whittle  (1954) 
ranged  from  about  .55  to  .2. 

The  transition  structure  of  the  real  data  will  be  briefly  dis¬ 
cussed  here  even  though  the  details  of  the  process  of  obtaining  trans¬ 
ition  matrices  and  a  further  discussion  of  the  results  will  be  given 
in  following  sections.  In  general,  when  five  states  were  used,  no  cases 
where  a  transition  skipped  a  state  between  horizontally  adjacent  points 
were  found,  and  only  two  cases  were  found  (from  state  3  to  5,  one  case 
in  Oklahoma  on  the  map  of  1  -  7  March  1976,  and  the  other  case  in 
North  Carolina  on  the  map  of  31  December  1973  to  6  January  1974)  in  a 
vertical  direction.  Of  course,  a  larger  data  base  may  reveal  more 
instances  of  such  transitions,  but  based  on  the  nature  of  the  tempera¬ 
ture  data  used,  such  occurrences  should  always  be  unusual.  Any  transi¬ 
tion  where  the  states  in  two  horizontally  or  vertically  adjacent  points 
are  not  equal  or  consecutive  will  be  called  an  "unusual  transition  of 
type  1." 

It  also  was  found  that  unequal  or  nonconsecutive  states  in  diagon¬ 
ally  adjacent  points  were  unusual.  Examples  of  such  combinations  are 
5  4 

2  and  2  *  To  devel°P  transition  matrices  for  Pickard's  method,  all 

1  2 

of  the  possible  combinations  of  states  in  a  "unit  square,"  such  as  ^  1 , 

are  counted.  By  examining  these  frequency  counts,  only  four  cases  were 

123232  34  43  43 

found,  one  each  of  2  3’  2  1  *  43’  and  4  5*  The  cases  of  4  5  and  5  5* 

3  4 

and  the  two  cases  °f  5  5  actually  represented  the  two  cases  of  "unusual 
transitions  of  type  1"  mentioned  in  the  last  paragraph.  If  the  states 
at  two  diagonally  adjacent  points  are  not  equal  or  consecutive,  and  if 
no  "unusual  transition  of  type  1"  occurs  in  the  unit  square  containing 
the  points,  this  is  called  an  "unusual  transition  of  type  2." 
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It  is  also  useful  to  determine  the  number  of  occurrences  of  un¬ 
equal  and  nonconsecutive  states  in  a  "unit  diamond,"  an  area  such  as 
2  2 

1  ,  •  This  will  be  discussed  more  fully  in  describing  "Method  2"  of 

simulation.  Other  orientations  of  such  a  diamond-shaped  area  will  not 
be  considered  here  because  of  the  process  of  simulation  used  in  this 
thesis.  If  the  states  at  the  ends  of  the  unit  diamond  are  not  equal 
or  consecutive,  and  if  no  type  2  or  type  1  transition  occurs  in  either 
of  the  unit  squares  that  contains  three  points  from  the  unit  diamond, 
this  is  called  an  "unusual  transition  of  type  3."  For  example,  ^  ^  w 

A  3 

would  be  an  unusual  transition  of  type  3,  while  c  ^  would  be  an  un- 

j  J> 

usual  transition  of  type  1.  Based  on  the  data  Isee  the  frequency  counts 
in  Appendix  E) ,  there  were  eight  unusual  transitions  of  type  3,  plus 
five  more  of  types  2  or  1 .  Many  of  these  occurrences  were  on  two  con¬ 
secutive  maps,  those  of  5-11  and  12— 1 S  February  1979,  although  three 
geographical  areas  were  involved. 

To  surttaarize,  three  specific  types  of  "unusual  transitions"  were 
defined  and  their  occurrences  in  the  actual  data  were  counted.  Since 
these  "unusual  transitions"  were  rare  in  -he  real  data,  a  realistic  sim¬ 
ulation  method  based  on  this  data  base  should  not  frequently  generate 
such  transitions. 

First -Order  Markov  Transition  Matrices 

FORTRAN  programs  were  written  to  count  the  number  of  transitions 
in  both  horizontal  and  vertical  directions,  for  three  and  five  states. 
These  transition  matrices  and  some  additional  data  ; which  will  be  dis¬ 
cussed  in  this  section)  are  shown  in  appendix  A.  Where  three  states 
were  used,  no  unusual  transitions  of  type  1  »from  state  1  to  3,  or  from 
3  to  1  in  horizontally  or  vertically  adjacent  points)  occurred.  As 
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mentioned  before,  two  unusual  transitions  of  type  1  occurred  in  a 
north-south  direction  when  the  data  was  divided  into  five  states,  but 
none  were  observed  in  a  horizontal  direction.  In  both  cases,  the  hori¬ 
zontal  and  vertical  transition  matrices  were  quite  similar,  with  most 
transition  probabilities  agreeing  within  .02.  No  consistent  pattern  of 
deviations  was  noticed. 

In  each  program,  the  unconditional  equilibrium  probabilities  were 
calculated  from  the  transition  probability  matrix  by  solving  VP  =  V, 
and  requiring  the  sum  of  the  probabilities  to  be  1.  (This  was  done  by 
expressing  the  problem  in  terms  of  simultaneous  equations,  and  using 
the  IMSL  subroutine  LEQT1F  to  solve  the  equations).  The  actual  uncon¬ 
ditional  and  calculated  equilibrium  probabilities  were  quite  similar, 
although  a  few  values  differed  by  over  .02.  ('/hen  comparing  horizontal 
with  vertical  transitions,  the  actual  probabilities  agreed  better  than 
the  calculated  probabilities.  No  consistent  pattern  of  deviations  was 
noticed  in  this  case  either,  although  the  cold  states  were  frequently 
calculated  to  occur  more  often  than  they  actually  occurred,  and  the 
warm  states  usually  were  calculated  to  occur  less  often  than  they 
actually  were  observed. 

Since  the  states  were  set  up  in  terms  of  standard  deviations  from 
average,  the  theoretical  frequency  of  each  state  could  be  calculated. 

In  all  cases,  the  middle  state  occurred  a  little  less  frequently  than 
expected.  Generally,  both  the  warmest  and  coldest  states  were  slightly 
more  frequent  than  expected,  except  that  when  five  states  were  used, 
state  5  occurred  a  little  less  frequently  than  the  theoretical  fre¬ 
quency.  The  fact  that  most  extremes  occurred  more  often  than  expected 
is  verified  by  the  fact  that  the  kurtosis  in  most  areas  was  lower  than 
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the  expected  value  for  a  standard  normal  distribution. 

The  reversibility  of  each  matrix  was  checked  to  help  verify  homo¬ 
geneity  or  inhomogeneity  of  the  data.  Only  the  five-state  vertical 
transition  matrix  shewed  any  difference  when  reversed,  at  least  to  the 
eighth  decimal  point.  This  matrix  cannot  be  reversible,  since  a 
transition  from  state  3  to  state  5  l from  north  to  south)  becomes  a 
transition  from  state  5  to  state  3  in  a  northward  direction.  In  this 
case,  both  the  reversed  and  reversible  transition  probability  matrices 
are  shown,  and  the  differences  are  .002  or  smaller. 

To  verify  the  intuitive  supposition  that  simulating  a  two-dimensional 
data  base  by  a  one-dimensional  method  will  produce  unrealistic  re¬ 
sults,  a  demonstration  was  performed  which  is  shown  in  figure  29.  Either 
the  left  side  of  the  data  base  was  simulated,  and  then  the  rows,  or  the 
top  edge  was  simulated,  and  then  the  columns. 

Transition  Matrices  Based  on  Galbraith  arid  Walley’s  Method 

All  configurations  of  states  in  the  form  ^  j!  were  counted  by  a 
FORTRAN  program  to  develop  transition  matrices  for  Galbraith  and  Wal ley's 
1 1976 )  method.  As  mentioned  in  chapter  VI,  this  method  bases  the 
probability  distribution  for  state  K  on  the  combination  of  states  H,J). 
The  transition  matrices  are  presented  for  both  three  and  five  states 

in  appendix  B.  Only  three  "unusual  transitions"  were  found,  one  case 

3  73 

of  ^  2  (type  2),  and  one  case  each  of  ^  ^  and  <-  s  (type  1),  since  only 

a  portion  of  the  unit  square  was  considered. 

A  sample  of  four  simulations  performed  using  Galbraith  and  (valley's 
method  is  shown  in  figure  30.  The  top  and  left  borders  were  generated 
using  the  first-order  transition  matrices.  In  general,  the  simulated 
data  fields  are  fairly  good,  and  it  would  be  possible  to  postulate 
realistic  weather  conditions  that  would  generate  such  temperature 
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Data  Bases  Generated  by  First-Order 
Horizontal  (H)  Karkov  Processes 


Plgure  30  -  Typical  Data  Bases  Generated  by  Galbraith  and  Valley's  Method 


patterns.  So,  this  method  could  be  considered  a  fairly  good  method, 
and  it  is  simple  to  perform  on  a  computer. 

Only  five-state  simulations  were  performed,  to  better  evaluate  the 
results  and  reveal  operational  problems  connected  with  the  simulations. 
One  immediately  apparent  result  was  that  it  was  possible  to  generate 
transitions  that  were  not  observed  in  the  original  data,  and  that  unus¬ 
ual  transitions  were  generated  quite  frequently.  With  this  method, 
mostly  "unusual  transitions  of  type  2"  were  generated.  Mien  developing 
the  transition  matrix,  if  there  were  no  occurrences  of  a  particular 
combination  of  input  states,  the  probabilities  were  all  specified  to  be 
cero.  The  method  chosen  to  deal  with  the  problem  in  this  case  is  to 
set  K  =  (1  ♦  J)/2,  (using  the  configuration  »  j|),  if  (I  ♦  J)/2  is  an 
integer,  and  to  randomly  select  from  the  integer  values  immediately 

above  and  below  (I  ♦  J)/2  if  it  is  not  an  integer.  Then  ^  *"  would  lead 

2  1  11 
to  ^  2  a,Kl  ^  would  lead  to  ^  ^  or  ^ 

Transition  Matrices  Based  on  Pickard's  Method 

To  calculate  the  transition  matrix  for  Pickard's  ( 1977'  method,  all 

occurrences  of  combinations  of  states  in  a  unit  square  were  counted. 

J  K 

In  this  section,  ^  L  will  denote  the  states  in  the  respective  positions 
of  the  square.  The  computer  program  which  performed  these  calculations 
is  shown  in  appt  idix  C  (for  the  five-state  case),  along  with  both  three- 
state  and  five-state  transition  matrices.  Also,  the  "cumulative  trans¬ 
ition  probability  matrix”  is  shown  for  the  five-state  case.  Such  a 
matrix  would  probably  be  the  easiest  way  to  perform  a  simulation.  A 
random  number  between  0  and  1  would  be  drawn ,  and  the  first,  value  of  the 
cumulative  transition  probability  matrix  which  equals  or  exceeds  the 
random  number  drawn  is  the  state  which  would  be  selected  for  the  data 


point  being  simulated. 

In  this  case,  the  computer  program  is  included  to  demonstrate  the 
procedures  involved.  The  modifications  to  adapt  this  program  to  deter¬ 
mine  transition  probability  matrices  for  other  simulation  methods  may 
not  be  extensive.  The  input  data  states  would  be  stored  in  a  data  file 
which  is  attached  under  the  local  file  name  STATES.  Four  rows  of  data 
points  would  be  stored  in  each  line  of  the  file,  in  the  format  6011,  so 
three  lines  would  be  used  to  store  a  complete  case.  The  180  data  values 
in  each  case  are  read  in  sequence  and  are  stored  in  the  array  INPUT, 
with  index  numbers  INPUT  (1)  to  INPUT  (180).  The  state  value  recorded 
at  position  (r,c)  is  stored  in  array  position  INPUT  ( 15* ( r  —  1)  +  c). 

The  matrices  used  in  this  program  are  (1)  ITPM,  the  counts  of  all 

J 

occurrences  of  combinations  of  (I,J,K,L),  in  the  configuration  ^  (2) 

TPM,  the  transition  probability  matrix,  (3)  CTPM,  the  cumulative  trans¬ 
ition  probability  matrix,  and  (A)  ISUM,  the  matrix  of  row  sums  in  ITPM 
(the  sum  of  all  occurrences  of  each  I,J,K  combination).  Since  the 
FORTRAN  version  used  at  AFIT.  allows  arrays  of  no  more  than  three  dimen¬ 
sions,  the  I  and  J  state  combinations  are  stored  in  the  first  dimension 
of  the  array,  under  the  name  IJ,  according  to  the  formula  on  line  19  in 
the  computer  program,  with  25  possible  values.  The  variables  LI,  LJ, 

LK,  and  LL  are  the  array  locations  of  the  INPUT  (state)  variables  that 
compose  the  unit  square.  For  example,  the  states  observed  in  the  unit 
square  at  the  northwest  corner  of  the  grid  are  stored  in  INPUT  array  loca¬ 
tions  LI  =  16,  LJ  =  1 ,  LK  =  2,  and  LL  =  17.  Then,  NK  is  the  state  at 
the  point  LK  and  NL  is  the  state  at  point  LL,  with  NIJ  being  a  combina¬ 
tion  of  the  states  at  points  LI  and  LJ.  Lines  21  to  23  increment  the 
appropriate  array  locations  in  ITPM  and  ISUM  and  add  1  to  ITOTAL,  the 
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total  number  of  transitions.  Since  only  14  unit  squares  can  be  con¬ 
tained  in  a  row  of  a  grid  15  points  wide,  line  24  increments  LI  by  1 
at  the  end  of  each  row. 

The  section  from  lines  27  to  37  calculates  the  transition  and 
cumulative  transition  probabilities  from  the  number  of  transitions  and 
the  row  sums.  If  no  occurrences  of  a  particular  (I,  J,K)  combination  are 
observed,  all  transition  and  cumulative  transition  probabilities  are 
zero. 

The  last  section  contains  PRINT  statements.  Variable  ROWEND  in 
lines  64  to  66  is  either  1  or  0,  depending  on  whether  or  not  there  are 
any  occurrences  of  the  particular  (I,J,K)  combination. 

Examples  of  simulations  performed  using  Pickard's  method  are  shown 
in  figure  31.  The  real  data  had  eight  unusual  transitions  of  types  1 
and  2,  since  the  whole  unit  square  was  considered,  in  105  data  fields. 
However,  Pickard's  method  generated  14  such  combinations  in  six  simula¬ 
tions.  The  simulations  look  acceptable  and  are  at  least  as  realistic 
as  those  generated  by  Galbraith  and  Walley's  method.  It  would  be 
difficult  to  determine  if  Pickard's  method  is  sufficiently  better  than 
Galbraith  and  Walley's  method  to  justify  the  additional  computer  storage 
space  involved  except  by  performing  additional  simulations.  However, 
either  method  uses  very  little  computer  time  because  of  the  simplicity 
of  the  operations. 

No  boundary  effects  were  noticed.  If  boundary  effects  occur,  there 
would  be  a  tendency  to  generate  unusual  patterns  near  the  top  and  left 
borders  (which  were  simulated  first  by  first-order  Markov  processes), 
and  the  patterns  would  settle  down  to  more  realism  in  the  interior  of 
the  simulated  data  field.  Since  Pickard  proved  that  his  method  produces 
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stationary  homogeneous  Markov  random  fields,  boundary  effects  would 
not  be  expected.  Similarly,  Galbraith  and  Walley's  method  appeared  to 
produce  stationary  fields,  even  though  they  did  not  deal  with  this 
explicitly  in  their  paper. 

A  "Method  1"  Transition  Matrix 

The  number  of  transitions  used  to  determine  a  transition  proba¬ 
bility  matrix  for  Method  1  can  be  obtained  from  the  number  of  transi¬ 
tions  counted  for  Pickard's  method,  since  Pickard's  method  requires  the 

frequencies  of  all  combinations  of  states  in  a  unit  square.  Denoting 

J  K 

the  states  in  a  unit  square  by  j  L,  the  vector  state  (J,K)  would  be 
the  input  state,  and  (I,L)  would  be  the  output  state.  For  a  five-state 
process,  this  would  give  a  25  x  25  transition  matrix.  Transition 
probabilities  were  developed  only  for  the  five-state  case,  and  the 
transition  probability  matrices  are  given  in  appendix  D. 

Simulations  generated  by  this  method  are  shown  in  figure  32.  In 
each  case,  the  top  row  was  generated  by  a  first-order  horizontal  Markov 
chain,  and  the  unit  squares  defining  the  input  and  output  states  alter¬ 
nated  in  position  from  row  to  row  as  in  figure  8.  Since  the  first-order 
horizontal  Markov  chain  was  found  to  be  reversible,  the  same  transition 
matrix  could  be  used  to  generate  extra  values  needed  at  either  end  of 
a  row. 

In  comparison  with  Galbraith  and  Walley's  and  Pickard's  methods, 
the  data  fields  simulated  by  Method  1  appear  deficient,  basically  by 
generating  far  too  many  unusual  transitions  of  type  1.  This  could  occur, 
for  example,  in  the  row  under  a  row  of  consecutive  values  of  st? „e  A. 

One  (4,4)  pair  could  generate  a  (4,3)  immediately  below  it,  3nd  the 
next  pair  could  generate  a  (5,4)  immediately  below  it,  so  the  set  of 
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Figure  32  -  Typical  Data  Bases  Generated  by  "Method  1 


four  generated  values  would  be  (A, 3, 5, A).  A  sequence  such  as  (3, 3, A, A) 
could  generate  a  sequence  of  (2, 2, 5, 5)  below  it.  If  an  unusual  transi¬ 
tion  of  type  1  occurs,  the  two  states  involved,  such  as  (3,5),  are 
averaged  to  specify  the  points  below  them.  If  the  average  is  not  an 
integer,  the  integer  values  immediately  below  and  above  the  average  are 
used.  For  example,  a  (2,5)  would  generate  a  (3, A)  below  it. 

Discussion  of  the  Problem  of  "Unusual  Transitions" 

All  of  the  two-dimensional  simulation  methods  discussed  here  gen¬ 
erated  unusual  transitions  much  more  frequently  than  in  the  original  data. 
The  number  of  unusual  transitions  of  each  type  was  counted  in  the  simula¬ 
tions  produced  by  each  method  discussed  so  far,  regardless  of  how  the 
data  values  were  simulated.  For  example,  Galbraith  and  Walley’s  method 
could  generate  an  unusual  transition  of  type  3  even  though  the  simula¬ 
tion  of  a  data  value  does  not  consider  a  complete  unit  diamond.  Type  2 
transitions  were  counted  in  all  unit  squares,  and  type  3  transitions  were 
counted  in  all  unit  diamonds.  However,  if  a  type  2  transition  was  found 
in  either  unit  square  that  contained  three  points  of  a  unit  diamond, 
the  transition  was  counted  as  type  2  rather  than  type  3.  For  example, 

the  six  points  5  ^  ^  contain  two  unit  squares,  ^  ^  and  ^  as  well  as 
3  4 

a  unit  diamond  ^  ^  .  One  unusual  transition  of  type  2  would  be  counted 

in  this  structure.  Also,  if  a  type  1  transition  is  found  in  a  unit 
square,  the  unusual  transition  will  be  counted  only  as  type  1.  The 
number  of  unusual  transitions  found  was  as  follows: 

Galbraith  and  Walley:  Type  1  -  7,  Type  2-9,  Type  3  -  1  in  6 
simulations. 

Pickard:  Type  1  -  2,  Type  2  -  12,  Type  3  -  11,  in  6  simulations. 
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Method  1:  Type  1-10,  Type  2  -  none,  Type  3  -  15,  in  4  simulations. 

Real  data:  Type  1  -  2,  Type  2-2,  Type  3  -  8,  in  105  data  fields. 

These  unusual  transitions  occur  frequently  in  simulations  due  to 
the  fact  that  a  Markov  chain  does  not  account  for  correlation  beyond 
the  range  of  the  data  points  used  as  input  states.  An  unusual  transi¬ 
tion  of  type  2  could  be  generated  easily,  for  example,  by  Galbraith  and 
Walley's  method  as  follows.  A  combination  of  3  would  generate  a 
structure  of  ^  ^  with  a  probability  of  .0324.  Assuming  that  the  proba¬ 
bilities  in  a  row  are  the  same  as  first-order  horizontal  transition 
probabilities  given  in  appendix  A,  the  3  in  the  top  row  would  be  followed 
by  a  2  with  a  probability  of  .0679.  This  would  give  a  diamond-shaped 
structure  of  3  £  ,  with  a  joint  probability  of  (.0324)1.0674)  =  .00213, 

assuming  that  these  transitions  are  independent  and  given  that  the  ^  ^ 
combination  occurred. 

With  the  proper  assumptions,  it  is  possible  to  calculate  the 
probability  of  each  different  possible  unusual  transition  of  type  1  or 
2.  For  the  following  discussion,  assume  that  no  unusual  transitions  of 
either  type  1  or  2  occurred  in  the  original  data,  so  that  any  occurrence 
of  either  type  in  simulated  data  would  be  spurious.  Also,  assume  that 
Galbraith  and  Walley's  method  is  to  be  used  for  simulation.  Then,  no 
unusual  transitions  of  any  kind  will  be  generated  within  the  triangle 
of  three  numbers  involved  in  a  single  transition,  such  as  ,  However, 
an  unusual  transition  of  type  2  may  be  generated  in  a  unit  diamond.  The 
12  possible  combinations  which  start  with  a  normal  transition  but  gen¬ 
erate  an  unusual  transition  of  type  2  are  as  follows:  i  *i  ->  2  *** 

45  21  32  43  21  32  43  23  34  .  4  5 

33  ^33  ’  5  5  >23  >  ^  ^  2  ^  2  • 

If  the  method  of  dealing  with  an  unusual  transition  of  type  2  is  to  have 
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the  next  simulated  value  be  the  average  of  the  two  states  involved  in 
the  unusual  transition,  no  unusual  transitions  of  type  1  could  be  gen¬ 
erated.  If  there  were  any  real  unusual  transitions  of  type  1  or  2  in 

the  data  from  which  transition  probabilities  were  determined,  some  more 

3  3  2 

extreme  combinations  could  be  generated,  such  as  ^  ^  or  ^  ^  .  Both 

of  these  could  be  generated  from  the  data  set  used  as  an  example  in  this 
chapter,  since  the  left  three  numbers  in  each  unit  diamond  are  combina¬ 
tions  which  were  observed  in  the  actual  data. 

It  is  desirable  to  see  if  the  probability  of  unusual  transitions 
can  be  calculated  on  an  unconditional  basis.  For  example,  the  uncondi- 

3  2 

tional  probability  of  ^  ^  is  equal  to  the  conditional  probability  of 
3  2  3 

^  ^  ;  given  that  ^  occurs,  multiplied  by  the  unconditional  probability 

3 

of  an  occurrence  of  ^  .  Actually,  this  problem  is  not  analytically 

solvable  at  present,  so  it  will  be  assumed  here  that  the  unconditional 
probability  of  any  j  K  combination  is  equal  to  the  probability  of  that 
combination  in  the  actual  data.  This  can  be  found  from  the  row  sums 
of  the  number  of  transitions  counted  for  Galbraith  and  Walley's  method 
as  in  appendix  B.  For  the  example  above,  the  probability  of  a  ,  J  is 
the  row  sum  where  1=3  and  J  =  3,  divided  by  the  total  number  of  trans¬ 
itions,  which  is  5035/16170  =  .31 14.  Multiplying  by  the  conditional 

3  2 

probability  given  above,  the  unconditional  probability  of  ^  4  is 
( .31 14) ( .00213)  =  .000684.  If  this  is  repeated  for  each  of  the  12 
possible  unusual  transitions  of  type  2,  the  total  unconditional  prob¬ 
ability  of  generating  an  unusual  transition  of  type  2  is  .00S27.  The 
six  simulations  performed  using  Galbraith  and  Walley's  method  contained 
14  x  11x6=  924  points  simulated  by  that  method.  Nine  of  these  were 
unusual  transitions  of  type  2  (all  of  the  type  1  transitions  were 
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generated  due  to  such  transitions  actually  existing  in  the  data),  so 
the  frequency  of  generation  was  9/924  =  .00974.  Only  two  cases  of  an 
unusual  transition  of  type  2  were  found  in  the  data  (excluding  transi¬ 
tions  of  type  1),  so  the  real  probability  would  be  about  2/16170  =  .00012. 

There  are  several  possible  methods  of  dealing  with  this  problem, 
all  of  them  arbitrary.  Actually,  using  the  transition  matrix  in  cases 
where  an  unusual  transition  is  simulated  that  matches  one  observed  in 
the  real  data  may  still  not  be  realistic,  since  the  transition  proba¬ 
bilities  may  be  based  on  only  one  case.  For  example,  using  Galbraith 

3  3 

and  Walley's  method,  an  occurrence  of  ^  always  led  to  ^  so  one 

3  2  3  3  3  3 

structure  of  5  5  was  generated,  and  also  a  long  sequence  of  ^  ^  ^  ^ 

So,  even  when  a  few  unusual  transitions  are  observed  to  actually  occur, 
it  may  be  best  to  neglect  them  in  developing  the  transition  probabil¬ 
ities  and  substitute  an  empirical  procedure. 

The  empirical  procedure  chosen  could  be  used  in  the  simulation 
either  when  all  transition  probabilities  are  zero  or  when  the  transition 
meets  criteria  for  being  "unusual."  Except  when  unusual  transitions  of 
type  2  or  type  1  are  allowed  because  of  such  occurrences  in  the  real 
data,  most  or  all  of  the  unusual  transitions  that  need  to  be  dealt  with 
are  type  2. 

One  procedure  is  the  "compromise"  procedure  described  earlier  in 

this  chapter,  where  the  input  states  are  averaged  to  determine  the 

3  3 

output  state.  For  example,  a  ^  would  result  in  a  structure  of  ^ 

and  a  ,  1  would  lead  to  either  ,  1  or  ,  2. 

4  4  2  4  3 

Another  procedure  would  be  to  prohibit  such  transitions,  and 
"back  up"  and  arbitrarily  select  another  alternative  from  the  last 
transition  that  led  to  the  "unusual  transition."  For  example,  if  a 
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3  3  2  2 

sequence  results  in  ^  4  »  the  ^  is  an  unusual  transition  of  type  2. 

The  program  would  "back  up"  and  replace  the  4  by  a  3  or  a  2.  While 
this  procedure  was  not  used  in  the  simulations  presented  in  this  thesis, 
so  that  the  problem  of  "unusual  transitions"  could  be  demonstrated,  it 
should  produce  more  realistic  data  fields  then  the  "compromise"  pro¬ 
cedure,  which  leaves  the  "unusual  transitions"  in  the  simulated  data 
fields. 

An  Additional  Suggested  Method  -  "Method  2" 

It  appears  that  most  unusual  transitions  are  generated  because  the 
state  at  the  point  immediately  northeast  of  the  data  value  to  be  simu¬ 
lated  (when  simulating  by  rows)  is  not  considered.  Since  Pickard's 
and  Galbraith  and  Walley's  methods  produced  similar  results,  it  appears 
that  the  point  northwest  of  the  point  to  be  simulated  provides  mostly 
redundant  information  since  its  state  value  was  already  used  in  the 
simulation  of  the  previous  point.  Therefore,  a  slight  variation  of 

Pickard's  method  is  suggested,  where  state  L  is  to  be  simulated  based 

J  K 

on  the  values  of  (I,J,K)  in  a  unit  diamond  of  the  form  ^  ^  .  On  the 

right  end  of  each  row,  the  value  of  K  is  not  available,  so  Galbraith 
and  Walley's  method  would  be  used  to  simulate  the  last  point  in  each 
row.  This  method  will  be  referred  to  as  "Method  2." 

Only  a  few  changes  in  the  PICKARD  computer  program  (appendix  C) 
were  needed  to  develop  transition  matrices  for  Method  2.  In  line  13, 
the  DO  statement  goes  from  1  to  13,  since  only  13  transitions  per  row 
can  be  counted.  In  line  24,  LI  =  LI  +  2  because  two  transitions  must 
be  skipped  at  the  end  of  each  row.  In  line  18,  the  array  locations  of 
LJ  and  LK  are  changed  to  LJ  =  LI  -  14  and  LK  =  LI  -  13. 
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The  transition  matrices  are  given  in  appendix  E,  and  examples  of 
simulations  produced  by  this  method  are  shown  in  figure  33.  As  in  pre¬ 
vious  discussion,  an  unusual  transition  of  type  3  is  defined  as  a  case 
where  I  and  K  in  a  unit  diamond  are  unequal  and  nonconsecutive  states. 
This  method  generated  15  unusual  transitions  of  type  3  in  six  simula¬ 
tions,  plus  one  transition  of  type  2.  No  unusual  transitions  of  type  1 
were  generated,  in  contrast  to  all  of  the  other  simulation  methods. 
Seven  of  the  type  3  transitions  matched  sequences  which  occurred  in  the 
real  data.  Based  on  a  subjective  examination  of  the  simulations,  the 
data  fields  produced  by  Method  2  appear  to  be  at  least  as  realistic  as 
those  generated  by  either  Pickard's  or  Galbraith  and  Walley's  methods. 

General  Guidelines  for  Users 

This  chapter  has  described  several  examples  of  applications  of 
two-dimensional  simulation  methods.  As  a  summary  of  this  chapter,  the 
basic  procedures  used  will  be  reviewed  and  the  steps  involved  will  be 
described,  in  general.  Because  of  the  differences  in  data  and  require¬ 
ments  from  one  project  to  another,  this  section  will  not  be  a  detailed 
user's  guide  but  will  simply  describe  some  of  the  factors  to  be  con¬ 
sidered. 

It  will  be  assumed  that  the  data  needed  has  been  selected  and  is 
available.  The  data  should  be  reduced  to  a  small  number  of  states  that 
are  relevant  to  the  simulation.  Also,  it  will  be  assumed  that  one  of 
the  two-dimensional  simulation  methods  (such  as  "Method  2")  has  been 
chosen  as  the  desired  procedure  to  use  for  the  simulation. 

After  the  data  is  collected,  it  is  best  to  look  at  the  character¬ 
istics  of  the  data  (1)  by  plotting  some  or  all  of  the  data  fields,  (2) 
by  examining  statistical  characteristics  at  each  grid  point,  such  as 
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Figure  33  -  Typical  Data  Bases  Generated  by  "Method  2 


mean,  standard  deviation,  skewness,  and  extremes,  (3)  by  examining  the 
correlation  structure  of  the  data,  and  (4)  by  subjectively  applying 
meteorological  knowledge  to  see  if  there  are  significant  discontinuities 
in  the  data.  Not  all  of  the  methods  need  to  be  applied  in  the  same  way 
or  with  the  same  detail  as  shown  in  this  thesis.  The  procedures  in 
this  section  will  be  assumed  to  apply  to  a  data  field  which  can  be 
considered  homogeneous. 

Next,  the  one-dimensional  transition  structure  of  the  data  should 
be  investigated  by  counting  transitions  in  both  horizontal  and  vertical 
directions.  Also,  it  is  useful  to  determine  if  these  matrices  are 
reversible.  Comparing  all  of  these  matrices  will  help  to  verify  the 
homogeneity  of  the  data  base.  If  the  data  base  is  not  homogeneous,  it 
may  be  still  possible  to  simulate  it  realistically,  but  extra  caution 
is  needed  because  most  of  the  theoretical  results  assume  a  homogeneous 
data  base.  Since  two  edges  of  the  data  base  are  to  be  simulated  by 
one-dimensional  processes,  it  may  be  useful  to  express  the  transition 
probabilities  as  cumulative  transition  probabilities,  as  described 
earlier  in  this  chapter.  The  unconditional  probabilita.es  also  should 
be  obtained,  so  that  the  first  point  of  each  data  field  may  be  gener¬ 
ated. 

The  examination  of  the  transition  structure  should  include  inves¬ 
tigating  the  types  of  transitions  that  occur,  particularly  those  that 
are  called  "unusual  transitions  of  type  1"  in  this  chapter.  While,  the 
one-dimensional  Markov  chain  will  not  generate  these  more  often  than  in 
the  real  data,  the  two-dimensional  procedures  nay  generate  these  trans¬ 
itions. 

Assuming  that  the  two-dimensional  simulation  method  has  been 
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chosen,  the  next  step  is  to  develop  the  transition  probability  matrix 
(and  usually  the  cumulative  transition  probability  matrix)  for  the  two- 
dimensional  process  by  counting  transitions  in  the  original  data.  If 
there  are  no  occurrences  of  a  particular  sequence  of  states,  all  of  the 
transition  probabilities  should  be  set  to  zero  (or  a  suitable  flag  in 
the  program  should  be  set)  so  that  the  program  will  not  attempt  to  divide 
by  zero  to  get  transition  probabilities. 

Reviewing  this  transition  probability  matrix  would  show  if  there 
are  many  "unusual  transitions  of  type  2"  (or  possibly  type  3  also,  if 
"Method  2"  is  used  to  develop  the  matrix).  The  method  to  deal  with 
"unusual  transitions"  should  be  decided  on  at  this  time.  The  occurrence 
of  no  transition  probabilities  greater  than  zero  for  a  particular  trans¬ 
ition  could  be  used  as  a  flag  to  detect  the  generation  of  an  "unusual 
transition"  that  did  not  occur  in  the  original  data,  and  possibly  the 
program  could  have  other  flags  to  detect  other  "unusual  transitions." 
While  the  "compromise  method"  of  dealing  with  these  transitions  was 
used  here  for  illustrative  purposes,  prohibiting  the  "unusual  transi¬ 
tions"  by  having  the  program  "back  up"  when  one  occurs  may  produce  more 
realistic  data  fields. 

Hie  last  step  is  to  write  a  computer  program  to  generate  data 
fields,  if  desired.  Generally,  the  simulation  procedure  will  (1) 
select  the  state  at  the  starting  point,  usually  the  upper  left  corner, 

(2)  generate  two  boundaries,  usually  the  left  and  top,  by  one-dimensional 
processes,  and  (3)  generate  the  interior  points  of  the  grid  by  the 
chosen  two-dimensional  method,  dealing  with  "unusual  transitions"  as 
set  up  in  the  program.  This  program  could  then  be  used  to  generate 
two-dimensional  simulated  data  fields. 
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VIII.  Summary,  Conclusions,  and  Recommendations 
f*or  Further  Study 

The  basic  objectives  of  this  thesis  were  to  summarize  the  use  of 
Markov  chains  in  one  dimension,  particularly  as  applied  to  weather 
data,  and  to  extend  the  theory  to  two  (or  more)  dimensions  so  that  the 
advantages  of  Markov  models  could  be  used  to  produce  realistic  but 
computationally  simple  simulations  of  higher  dimensional  data  fields. 

A  general  discussion  of  the  characteristics  of  real  weather  varia¬ 
bles  and  observed  weather  data  is  first  presented,  to  help  familiarize 
users  with  those  characteristics  that  cannot  and  those  that  can  be 
realistically  modeled  by  a  Markov  process.  A  brief  summary  of  the 
theory  of  Markov  and  semi-Markov  processes  is  given  in  the  context  of 
the  characteristics  of  weather  data.  The  results  of  a  literature 
search  of  the  topics  of  applications  of  Markov  and  semi-Markov  pro¬ 
cesses,  mainly  to  weather  data,  are  reported. 

To  investigate  the  possibility  of  extending  Markov  concepts  to 
higher  dimensions,  several  topics  become  especially  important  when  sim¬ 
ulating  homogeneous  multi-dimensional  data  fields.  The  first  topic 
is  reversibility:  Can  the  same  transition  probability  matrix  be  used 
in  all  directions,  assuming  that  the  same  probabilities  exist  in  all 
directions  in  the  real  data?  The  second  topic  is  invariance  of  gener¬ 
ation:  Will  the  probability  distribution  of  states  at  one  point, 
given  the  state  at  another  point  which  has  already  been  specified,  be 
the  same  regardless  of  the  sequence  with  which  the  points  in  between 
are  simulated  (such  as  by  columns  instead  of  by  rows)?  The  third  isse 
is  correlation:  How  much  of  the  correlation  structure  of  the  real  data 
will  be  reproduced  by  the  simulation  method?  The  fourth  topic  discussed 
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concerns  the  consequences  of  inhomogeneity  or  directionality  in  the 
data. 

A  few  papers  about  simulations  of  two-dimensional  data  bases  have 
been  published,  although  only  one  example  ( which  was  not  based  on  a 
Markov  process)  has  been  found  (Wellberry  and  Galbraith  (1973)).  The 
Markovian  methods  of  Galbraith  and  Walley  (1976)  and  Pickard  (1977) 
were  described.  These  methods  are  different  from  the  typical  vector 
representation  of  Markov  processes.  Another  suggested  procedure, 
called  "Method  1,"  was  described. 

Because  little  work  has  been  published  in  the  area  of  applying 
the  theory  to  actual  examples,  the  proposed  simulation  methods  were 
applied  to  a  set  of  temperature  data  on  a  grid  over  the  eastern  and 
central  United  States.  Galbraith  and  Walley' s  method  as  well  as 
Pickard's  method  produced  realistic  data  fields,  with  the  exception 
that  they  generated  certain  combinations  of  points  (  such  as  states  A 
and  2  in  diagonally  adjacent  points)  much  more  frequently  than  these 
combinations  were  observed  in  the  real  data.  "Method  1"  produced  data 
fields  that  were  not  as  realistic  as  those  generated  by  the  other 
methods.  A  variation  on  Pickard's  method,  called  "Method  2,"  greatly 
reduced  the  frequency  of  these  "unusual  transitions,"  without  in¬ 
creasing  the  complexity  of  the  model. 

Three  conclusions  can  be  drawn  from  this  exercise:  (1)  Pickard's 
method,  Galbraith  and  Walley's  method,  and  "Method  2"  produce  realistic 
data  fields,  although  certain  combinations  of  data  values  may  be  pro¬ 
duced  more  frequently  in  the  simulations  than  in  the  actual  data. 

(2)  Any  method  which  does  not  use  all  of  the  closest  data  points  which 
have  already  been  simulated  as  input  states  will  probably  generate 
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more  of  these  "unusual  transitions”  than  a  method  which  does  consider 
all  of  the  adjacent  points.  (3)  Methods  of  the  complexity  used  here 
as  examples  are  quite  feasible  for  economical  computer  simulation. 
However,  the  complexity  increases  rapidly  either  as  the  number  of  input 
points  increases  or  the  number  of  different  states  increases.  In  such 
cases,  regression  methods  (possibly  using  dumny  variables  and  some  of 
the  interaction  terms)  may  be  more  appropriate  than  Markov  methods. 

Some  areas  for  further  investigation  may  be  suggested  from  this 
thesis : 

(1)  In  a  theoretical  area,  it  would  be  desirable  to  prove  or 
disprove  the  homogeneity  of  the  simulation  schemes  (other  than  Pickard's) 
discussed  here,  when  a  homogeneous  data  base  is  used  to  develop  the 
transition  probabilities. 

(2)  It  would  be  interesting  to  investigate  a  data  base  for  which 
the  "unusual  transitions”  as  discussed  here  are  not  infrequent.  In 
such  a  case,  would  the  various  simulation  methods  continue  to  generate 
these  "unusual  transitions”  more  frequently  than  in  the  actual  data,  and 
would  the  simulations  be  realistic?  A  possible  technique  to  evaluate  a 
proposed  simulation  method  would  be  to  generate  a  large  number  of  cases, 
develop  transition  matrices  from  the  simulated  data,  and  compare  the 
matrices  with  those  obtained  from  the  real  data. 

(3)  When  a  data  field  is  not  homogeneous,  it  is  divided  into 
areas  that  are  considered  homogeneous  and  separate  transition  matrices 
are  used  in  each  area.  It  would  be  useful  to  investigate  if  this  causes 
any  theoretical  problems  such  as  "boundary  effects." 

(4)  The  application  of  these  methods  to  a  data  field  that  is 
fundamentally  inhomogeneous  (such  as  the  probability  of  icing, 
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considering  a  data  field  with  a  considerable  north-south  extent,  or 
the  probability  of  snow,  or  almost  any  data  field  in  a  mountainous 
area)  could  be  investigated  to  see  if  reasonable  results  can  be  produced. 

(5)  Statistical  testing  procedures  suitable  for  testing  the  good¬ 
ness  of  fit  of  a  two-dimensional  simulation  procedure  should  be  devel¬ 
oped.  While  these  may  be  similar  to  the  statistical  tests  used  for 
one-dimensional  goodness  of  fit,  other  factors  may  need  to  be  consid¬ 
ered,  such  as  the  impact  of  "unusual  transitions"  that  can  be  generated 
even  if  they  do  not  occur  in  the  original  data. 
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APPENDIX  A 

First  Order  Markov 
Three-State  and  Five-State 
Transition  Matrices 


FIRST  ORDER'  THREE  STATES'  HORIZONTAL  FIRST  ORDER'  THREE  STATES'  VERTICAL 
TRANSITIONS  TRANSITIONS 


FIRST  ORDER »  FIVE  STATESf  HORIZONTAL 
TRANSITIONS 

NUMBER  OF  TRANSITIONS 


I 

J=»  1 

J=2 

J=3 

J=4 

J=5 

ROW  SUM 

1 

1035 

224 

0 

0 

0 

1259 

2 

209 

3525 

424 

0 

0 

4158 

3 

0 

449 

5722 

446 

0 

6617 

4 

0 

0 

474 

3799 

191 

4464 

5 

0 

0 

0 

165 

977 

1142 

TOTAL  NUMBER  OF  TRANSITIONS  17640 

TRANSITION  PROBABILITY  MATRIX 
I  J=1  J=2  J=3  J=4  J=5 

1  .8221  .1779  .0  .0  .0 

2  .0503  .8478  .1020  .0  .0 

3  .0  .0679  .8647  .0674  .0 

4  .0  .0  .1062  .8510  .0428 

5  .0  .0  .0  .1445  .8555 

UNCONDITIONAL  PROBABILITIES 
CALCULATED  FROM  TPM 

J=1  J=2  J=3  J=4  J=5 

.0702  .2486  .3737  .2372  .0702 

ACTUAL  PROBABILITIES 

.0714  .2357  .3751  .2531  .0647 

THEORETICAL  PROBABILITIES 

.0668  .2417  .3829  .2417  .0668 
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APPENDIX  B 


Galbraith  and  Walley 
Three-State  and  Five-State 
Transition  Matrices 


GALBRAITH  AND  WALLEY 
THREE  STATES 


NUMBER  OF  TRANSITIONS 


I 

J 

N  -  1  t 

<  *  2 

K  ®  3 

ROW  SUM 

1 

1 

3241 

118 

0 

3359 

1  2 

196 

221 

0 

417 

1 

3 

0 

0 

0 

0 

•L 

1 

208 

237 

0 

445 

2 

2 

145 

7369 

146 

7660 

2  3 

0 

259 

227 

486 

3 

1 

0 

0 

0 

0 

3  2 

0 

241 

197 

438 

3 

3 

0 

138 

■**■*'■*  7 

O  tu  / 

3365 

TOTAL 

NUMBER  OF 

TRANSITIONS 

16170 

TRANSITION  PROBABILITY  MATRIX 


1  1  .9649 

1  2  .4700 

1  3  .0 

2  1  .4674 

2  2  .0189 

2  3.0 

3  1  .0 

3  2  .0 

3  3  .0 


N  «  2 

K  «  3 

.0351 

.0 

.5300 

.0 

.0 

.0 

.5326 

.0 

.9620 

.0191 

.5329 

.4671 

.0 

.0 

.5502 

.  4490 

.0410 

.9590 

GALBRAITH  AND  WALLFY 
FIVE  STATES 


NUMBER  OF  TRANSITIONS 


I 

J 

N  =  1 

N  *  2 

K  =  3 

N  -  4  1 

<  =  5 

ROU  SUM 

1 

1 

822 

67 

0 

0 

0 

889 

1 

2 

123 

141 

0 

0 

0 

269 

2 

1 

102 

161 

0 

0 

0 

263 

2 

2 

91 

2811 

143 

0 

0 

3045 

2 

3 

0 

240 

242 

0 

0 

482 

3 

2 

0 

250 

253 

0 

0 

503 

3 

3 

0 

159 

4713 

163 

0 

5035 

3 

4 

0 

0 

308 

246 

0 

554 

4 

2 

0 

0 

1 

0 

0 

1 

4 

3 

0 

0 

291 

245 

1 

537 

4 

4 

0 

0 

146 

3128 

78 

3352 

4 

5 

0 

0 

0 

112 

94 

206 

5 

3 

0 

0 

0 

0 

1 

1 

5 

4 

0 

0 

0 

97 

06 

183 

5 

5 

0 

0 

0 

53 

797 

850 

TRANSITION  1 

PROBABILITY 

MATRIX 

I 

J 

N  =  1 

N  =  2 

K  *  3 

K  =  4 

N  =  5 

1 

1 

.9246 

.0754 

.0 

.0 

.0 

1 

2 

.4758 

.5242 

.0 

.0 

.0 

2 

1 

.3878 

.6122 

.0 

.0 

.0 

2 

2 

.0299 

.9232 

.0470 

.0 

.0 

2 

3 

.0 

.4979 

.5021 

.0 

.0 

3 

2 

.0 

.4970 

.5030 

.0 

.0 

3 

3 

.0 

.0316 

.9360 

.0324 

.0 

3 

4 

.0 

.0 

.5560 

.  4440 

.0 

4 

2 

.0 

.0 

1.0 

.0 

.0 

4 

3 

.0 

.0 

.5419 

.4562 

.0019 

4 

4 

.0 

.0 

.0436 

.937  2 

.0233 

4 

5 

.0 

.0 

.0 

.5437 

.4563 

5 

3 

.0 

.0 

.0 

.0  J 

L  .0 

5 

4 

.0 

.0 

.0 

.5301 

.  4699 

5 

5 

.0 

.0 

.0 

.0624 

.9376 
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Transition  Matrices 
and 

Computer  Program  and 


l 

t 

* 


Transition  Matrices 
for  Five  States 


U  U  bl  U  M  Kl  M  M  M 


PICKARD 
THREE  STATES 

NUHBER  OF  TRANSITIONS 

IJK  L  *  1  L  *  2  L  =  3  ROW  SUM 


1  1 

1 

3135 

111 

0  3246 

1  1 

2 

83 

118 

0  201 

1  2 

1 

106 

7 

0  113 

1  2 

2 

113 

103 

0  216 

2  1 

1 

101 

111 

0  212 

2  1 

2 

10 

128 

0  138 

2  2 

1 

107 

126 

0  233 

2  2 

2 

135 

7093 

138  7366 

2  2 

3 

0 

109 

131  240 

2  3 

2 

0 

148 

8  156 

2  3 

3 

0 

150 

96  246 

3  2 

2 

0 

109 

99  208 

3  2 

3 

0 

3 

127  130 

3  3 

2 

0 

132 

98  230 

3  3 

3 

0 

135 

3100  3235 

TOTAL 

NUMBER  OF 

TRANSITIONS  16170 

TRANSITION  PROBABILITY  MATRIX 


I 

J 

K 

r 

n 

L  *  2 

L  =  3 

1 

1 

1 

.9658 

.0342 

.0 

1 

1 

2 

.4129 

.5871 

.0 

1 

2 

1 

.9381 

.0619 

.0 

1 

2 

2 

.5231 

.4769 

.0 

2 

1 

1 

.4764 

.  5236 

.0 

2 

1 

2 

.0725 

.9275 

.0 

2 

2 

1 

.4592 

.5408 

.0 

2 

2 

2 

.0183 

.9629 

.0187 

2 

2 

3 

.0 

.4542 

.5458 

2 

3 

2 

.0 

.9487 

.0513 

2 

3 

3 

.0 

.6098 

.3902 

3 

2 

2 

.0 

.5240 

.4760 

3 

2 

3 

.0 

.0231 

.9769 

3 

3 

2 

.0 

.5739 

.4261 

3 

3 

3 

.0 

.0417 

.9583 
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CUMULATIVE  TTANSITION  PKOB*e*liTY  MTRIV 
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METHOD  1'  TRANSITION  MATRIX 


NUMBER  OF  TRANSITIONS 


JK  m 

INPUT 

STATE* 

IL  = 

OUTPUT 

STATE 

JK 

IL 

NUMBER 

IL 

NUMBER 

IL 

NUMBER 

IL  NUMBER 

TOTAL 

11 

11 

759 

12 

62 

21 

49 

22 

83 

953 

12 

11 

51 

12 

68 

21 

6 

85 

23 

1 

211 

21 

11 

63 

12 

5 

21 

53 

22 

78 

199 

22 

11 

77 

12 

73 

21 

84 

22 
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23 

134 

32 

118 

33 

117 

3200 

23 

22 

94 

23 

122 

32 

4 

33 

161 

381 

32 

21 

1 

22 

129 

23 
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32 

132 

33 

136 

43 

1 

407 

33 

22 

146 

23 
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32 

155 

33 

4407 

34 

146 

43 

148 

44 

120 

5242 

34 

33 

138 

34 

130 

43 

8 

44 

127 

45 

1 

55 

2 

406 

43 

33 
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34 

17 

43 

143 

44 

125 

45 

1 

55 

1 

432 

44 

33 
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34 

116 

43 

138 

44 

2941 

45 

73 

54 

43 

55 

54 
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45 

44 

51 

45 

59 

54 

3 

55 

58 
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54 

44 

60 

45 

4 

54 

54 

55 

30 
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55 

44 

61 

45 

35 

54 

50 

55 

739 
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TOTAL 

NUMBER  OF  TRANSITIONS 

16170 
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METHOD  1  *  TRANSITION  MATRIX 


' 


TRANSITION  PROBABILITIES 


JN  *  INPUT  STATE t  IL  =  OUTPUT  STATE 


JK 

IL 
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IL 
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APPENDIX  E 

"Method  2V_ 
Five-State 


Transition  Matrices 


ROM  SUM 


NUMBER  OF  TRANSITIONS 
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0 

h 
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C 
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WIRE*  OF  TRANSITIONS 


I  J  K 


L*1  L*2  L»3  L*4 
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0 

0 

1 

r 

0 

1 

0 

0 

0 

c 

0 

0 

0 

0 

0 

0 

0 

C 

0 

0 

0 

0 

0 

C 

0 

0 

0 

0 

0 

0 

c 

c 

2 

0 

0 

2 

0 

0 
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171 

0 

4  25 

0 

0 

17 

52 

1 

70 

0 

0 

0 

c 

0 

C 

c 

0 

0 

b 

0 

9 

0. 

0 

6 

0 

0 

0 

e 

0 

66 
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1 
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0 

0 

65 
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40 
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96 

34 
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0 

0 

0 

0 

9 

P 

0 

0 

0 

0 

0 
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0 

0 
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0 

0 
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0 

41 

14 

54 

p 

a 

0 

63 

76 
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NUMBER  OF  TRANSITIONS 


I 

J 

K 

1*1 

L*2 

L=3 

L*t 

L*5 

RON  SUM 

5 

1 

1 

0 
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0 

0 

0 

0 

5 

1 

2 
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5 

1 

3 

0 
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0 
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4 
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0 
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c 

5 

2 

5 

0 

0 

0 

c 

0 

0 

5 

3 

1 

0 

0 

0 

0 

0 

0 

5 

3 

2 

0 

0 

0 

0 

0 

r 

5 

3 

3 

0 

0 

0 

0 

0 

0 

5 

3 

4 

0 

0 

0 

0 

1 

l 

5 

3 

5 

0 

0 

0 

c 

0 

t 
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1 
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c 
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1 

0 

i 
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4 

4 

0 

0 
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76 
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5 

4 

5 

0 

0 

0 
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24 

3C 

5 

5 

1 

0 

0 

0 

6 

0 

C 

5 

5 

2 

0 

0 

0 

0 

0 

C 

5 

5 

3 

0 

0 
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0 
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0 

5 

5 

4 

0 

0 

0 

23 

64 

67 

5 

5 

5 

0 

0 

0 

27 
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TOTAL  NUMBER  OF  TRANSITIONS  IS  15015 
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5 
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1. OGOO 

.5936 

1. 3000 

1.3000 

.2*25 

.9857 

i.  o c  no 

0.0131 

G.0090 

O.OCOO 

U.OCCL 

0.9000 

3. 0300 

O.OOOC 

0.0030 

3. OGJO 

.2491 

.9963 

1.0000 

•  0  239 

.9853 

1. 3000 

O.OCOO 

.7385 

1. 3JU0 

0.0  GOO 

O.UOOO 

0.0000 

0.0000 

0.0030 

3.0000 

o.ooco 

0. GOUO 

a.  ocoo 

O.OGCC 

.7407 

1. OuOO 

0.0  GOO 

.4  468 

1.0003 
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CUMULATIVE  TRANSITION  PROBABILITY  MATRIX 


I  J  K 

L  =  1  L  =  2 

511  o.oooo  o.oooc 

512  0.00P9  0.0J00 

513  0.0000  0.0000 

5  1  4  O.OOOC  O.OOOC 

515  0.0000  C.OOOC 

521  o.oroo  O.OOCO 

522  0.0000  O.COOO 

523  0.0000  O.OJUG 

524  O.OCOO  0.0000 

525  O.COOO  O.OOOG 

531  0.0000  0.0030 

532  0.0000  O.OOOC 

533  0.0003  O.OOCO 

5  3  4  0.  0000  0.000  0 

535  0.0000  0.0000 

541  0.0300  0.3000 

542  O.OOOC  0.3300 

543  O.OCOO  0.0303 

544  0.0000  0.0000 

545  O.OOOG  0.0300 

551  0.0003  0.3300 

552  3.0030  0.0300 

553  0.3000  0.0003 

554  O.OOOC  0.0000 

555  0.0000  0.0300 


L  =  3  L  =  4  L  =  5 

O.OOOC  0.3000  0.0000 
0.013G  G. 30 JO  3.00  00 
U.OtOG  C. 3003  3.0000 
O.OOOG  0.3000  3.3000 
O.OOuO  0.0030  3.0330 
O.OOOC  0.3030  O.OCOO 
O.OOOG  0.0030  3.0000 
O.OOOC  0.0003  3. 3000 
O.OuOO  0.0030  3.0000 
0.0000  0.0030  3.0000 
C.OOCl  UGCOQ  0 . 00  CO 
0.3033  0.0033  O.OCOO 
O.OOCO  G. 2000  O.OCOO 
O.OOOC  O.COOO  l.OGGO 
O.OCOO  0  •  u  000  O.OCOO 
O.OCOO  O.OOOG  3.3  303 
O.OOCO  0 . 0 0o3  3.0003 
O.OCOO  1.0030  1.3003 
O.GuOC  .5930  l.OtlCO 
O.OOOC  .2000  1.0GJ3 
C.OOOC  0.0003  3.0300 
U.OGOO  O.jOOC  O.OlOO 
Q.OGOC  0.0030  O.OOCO 
O.OOGC  .26*h  1.0000 
O.OLOO  .0384  l.OOCO 
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